Introduce a bazel-wrapper #226049
Comments
Are you referring to accommodating remote builds (if I understood #225074 (comment) correctly, "execution" is the Bazel term for a build) using Bazel, rather than Nix? Remote builds sound like something you're better off accomplishing with Nix than with Bazel, do they not?
That sounds interesting, I haven't heard of it! Do you know what the situation will be for substituting Nix-managed dependencies in place of the ones pinned by Bazel once bzlmod is introduced? There are probably different opinions, but I think that ideally we'd want Bazel to respect a "dependency injection" workflow similar to that of CMake.
I think I understand from "Technical Details" what is needed to prepare a container for Bazel to run in. I may have misunderstood your entire post. If you begin to feel like I did, feel free to reply as if I were 5.
EDIT: I'm now looking up what RBE is and looking back at tensorflow build times 🙈 |
@SomeoneSerge Upon rereading my post I realized that I was rather imprecise, apologies 😅
Yes. Bazel.
Remote execution in Bazel is when you invoke Bazel locally but use a remote server to run the actual build. E.g. you have a laptop but want to use an 80-core machine in the cloud to speed up the build. Such setups look roughly like this:
Note that no one runs any build locally, since every executor needs to have exactly the same environment to be able to reuse the shared build cache.
But Nix is already reproducible. It is so reproducible that it is possible to recreate the build environment of the remote executor locally. This makes setups like these possible:
This is an extreme example and has some security implications, but it essentially means that you could build your project locally once, push the artifacts to the remote cache and then have someone else reuse that same cache without the need for intermediary remote executors.
Bazel has a new, highly experimental feature called "Build without the Bytes" where you wouldn't even need to download the entire artifact cache anymore, just the leaves of the subgraph you want to rebuild.
All of this is a bit difficult to get working because every tool involved in the build process needs to be exactly reproducible. Not just the compiler, but also things like archivers, the Java version that Bazel runs on etc. EVERYTHING 😂
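To make the cache-sharing argument above concrete, here is a minimal Python sketch. It is not Bazel's actual remote-cache protocol: `action_key` and `SharedCache` are hypothetical names, and the digests are placeholder strings. It only illustrates the idea that a cache entry is keyed by a hash of the command, its inputs and every tool involved, so two machines share results only when all of those match.

```python
import hashlib
import json


def action_key(command, input_digests, toolchain_digests):
    """Digest of everything that can influence an action's output (hypothetical helper)."""
    payload = json.dumps(
        {
            "command": command,
            "inputs": input_digests,
            "toolchain": toolchain_digests,
        },
        sort_keys=True,  # make the key independent of dict ordering
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class SharedCache:
    """In-memory stand-in for a remote cache shared by several machines."""

    def __init__(self):
        self._outputs = {}

    def get_or_build(self, key, build):
        """Return (output, was_cache_hit); run `build` only on a miss."""
        if key in self._outputs:
            return self._outputs[key], True
        output = build()
        self._outputs[key] = output
        return output, False


# Placeholder digests; in a Nix setup these would follow from the pinned store paths.
toolchain = {"clang": "digest-clang", "ar": "digest-ar", "jdk": "digest-jdk"}
command = ["clang", "-c", "a.cc", "-o", "a.o"]
inputs = {"a.cc": "digest-a"}

cache = SharedCache()

# First developer builds locally and populates the cache.
key_alice = action_key(command, inputs, toolchain)
_, hit_alice = cache.get_or_build(key_alice, lambda: "a.o")

# Second developer has an identical environment and gets a hit without building.
key_bob = action_key(command, inputs, dict(toolchain))
_, hit_bob = cache.get_or_build(key_bob, lambda: "a.o")

# A single differing tool (here the archiver) yields a different key, so no sharing.
drifted = dict(toolchain, ar="digest-of-a-different-ar")
key_drifted = action_key(command, inputs, drifted)

print(hit_alice, hit_bob, key_drifted == key_alice)  # prints: False True False
```

The last line is the point of the "EVERYTHING" remark: one tool that differs between machines is enough to turn every dependent cache hit into a miss.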
I'm not entirely sure, but I think this could make it easier to get at least some hashes into Nix. The Bazel Central Registry already stores repos and hashes in a way that is fairly similar to the JSON files currently used in Nix. For instance, this: https://github.com/bazelbuild/bazel-central-registry/blob/main/modules/fmt/9.1.0/source.json I could imagine that crawling Bazel registries is easier to maintain than manually keeping up with those hashes.
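As a rough illustration of what such crawling could look like, the following Python sketch turns the text of one registry `source.json` into a Nix `fetchurl` attribute. The function names are hypothetical, the URL and hash in the example are placeholders rather than the real fmt 9.1.0 values, and only the `url` and `integrity` fields are read; fields such as `strip_prefix` or patches are ignored here.

```python
import json

# Nix accepts SRI hashes for these algorithms; other integrity algorithms
# (e.g. sha384) would need the archive to be re-hashed instead.
NIX_SRI_PREFIXES = ("sha256-", "sha512-")


def nix_string(value):
    """Quote a Python string as a Nix string (simplified: JSON quoting plus ${ escaping)."""
    return json.dumps(value, ensure_ascii=False).replace("${", "\\${")


def source_json_to_nix(module, version, source_json_text):
    """Render one registry entry as a `"module@version" = fetchurl { ... };` attribute."""
    source = json.loads(source_json_text)
    integrity = source["integrity"]
    if not integrity.startswith(NIX_SRI_PREFIXES):
        raise ValueError(f"unsupported integrity algorithm: {integrity}")
    attr = nix_string(f"{module}@{version}")
    return (
        f"{attr} = fetchurl {{\n"
        f"  url = {nix_string(source['url'])};\n"
        f"  hash = {nix_string(integrity)};\n"
        f"}};"
    )


# Placeholder entry: the URL is made up and the hash is the all-zero fake hash.
example = json.dumps(
    {
        "url": "https://example.com/fmt-9.1.0.zip",
        "integrity": "sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=",
        "strip_prefix": "fmt-9.1.0",
    }
)
rendered = source_json_to_nix("fmt", "9.1.0", example)
print(rendered)
```

Because the registry already stores SRI-style hashes, the `integrity` value can be passed through unchanged as `hash` in this sketch; a real crawler would additionally have to deal with patches and prefix stripping.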
I'm thinking something roughly like this:
bazel = wrapBazelWith {
  bazel = pkgs.bazel; # Or some custom-built Bazel
  ccToolchain = pkgs.stdenv; # or cudaPackages.stdenv, or llvmPackages.stdenv
  javaToolchain = pkgs.somejavasetup;
  # ...
};
Then this Bazel could be passed to e.g. Something like this is already somewhat possible by e.g overriding the
EDIT: We already have the RBE setup without remote executors seemingly working, but it's very fragile and it'll take some time and further testing until I can push this to GitHub. |
WIP commit that implements the setup I described: eomii/rules_ll#83 Essentially this builds a container image and uses the Switching back and forth between a regular build and This uses a currently very inelegant pseudo-bazel-wrapper to aggregate the toolchains and Bazel. This could be made more flexible to be compatible with arbitrary toolchain configs/stdenvs. |
Is this related to this idea? Having something like |
Also relevant: tweag/rules_nixpkgs#180 |
Tried to get tensorflow-lite to cross-compile for someone in the Matrix cross channel, but couldn't, and I'm not interested in Bazel, so I'm dumping this here.
diff --git a/pkgs/development/libraries/science/math/tensorflow-lite/default.nix b/pkgs/development/libraries/science/math/tensorflow-lite/default.nix
index 1ac08ce0cd2f..7f626827728f 100644
--- a/pkgs/development/libraries/science/math/tensorflow-lite/default.nix
+++ b/pkgs/development/libraries/science/math/tensorflow-lite/default.nix
@@ -12,10 +12,10 @@ let
bazelDepsSha256ByBuildAndHost = {
x86_64-linux = {
x86_64-linux = "sha256-61qmnAB80syYhURWYJOiOnoGOtNa1pPkxfznrFScPAo=";
- aarch64-linux = "sha256-sOIYpp98wJRz3RGvPasyNEJ05W29913Lsm+oi/aq/Ag=";
+ aarch64-linux = "sha256-WVOMYvwm6yHl3T4gS/7YWaN0CC9m1ayr3zIBQyaX6b8=";
};
aarch64-linux = {
- aarch64-linux = "sha256-MJU4y9Dt9xJWKgw7iKW+9Ur856rMIHeFD5u05s+Q7rQ=";
+ aarch64-linux = "sha256-WVOMYvwm6yHl3T4gS/7YWaN0CC9m1ayr3zIBQyaX6b8=";
};
};
bazelHostConfigName.aarch64-linux = "elinux_aarch64";
@@ -84,6 +84,11 @@ buildBazelPackage rec {
postPatch = ''
rm .bazelversion
+
+ substituteInPlace tensorflow/tools/toolchains/embedded/arm-linux/cc_config.bzl.tpl \
+ --replace '%{AARCH64_COMPILER_PATH}%/lib/gcc/aarch64-none-linux-gnu/11.3.1/include' "${lib.getDev stdenv.cc.libc}/include" \
+ --replace '%{AARCH64_COMPILER_PATH}%/aarch64-none-linux-gnu/include/c++/11.3.1/' "${lib.getDev stdenv.cc.libc}/include" \
+ --replace '%{AARCH64_COMPILER_PATH}%/bin/aarch64-none-linux-gnu-' "${stdenv.cc}/bin/${stdenv.cc.targetPrefix}"
'';
preConfigure = ''
|
I skimmed through this again after the notification, and I realize I may have misunderstood what is being proposed here. Is this issue only about using the nix-packaged Bazel outside nix-build, or is this also about fixing the effectively unusable (at least when it comes to tf, tfp, xla in nixpkgs) Btw, @aaronmondal you mention |
My understanding now is that this issue is about being able to re-use a particular configuration of Bazel (which mostly means the choice of toolchains?) between the nix-build and the dev env. Is that understanding correct? |
When I originally built the wrapper it was just about having a reproducible environment around Bazel. But now I'm quite interested in getting a better build experience for xla and jax to stick as close to upstream as possible. So I guess I'm saying that the current I'm still trying to understand the internals in nix a bit better, but so far I still think that something like a
AFAIK the lockfile contains the hashes of the sources for the Bazel registry dependencies, i.e. the Starlark sources of custom build rules. Note that this is still experimental, and last time I checked (~3 months ago) it didn't work with nix-built Bazel at all. I might be wrong though. I can very well imagine that the lockfile has functionality similar to what you describe. Even if it doesn't, that can be changed in Bazel. The larger hurdle I see is that neither xla nor jax nor tf nor anyone else has migrated to bzlmod yet lol 🤣 Bzlmod makes it a lot harder to use hacky workarounds, and there are ... a few of those in xla and friends 😆
Yes I was originally talking about devenv-interoperability. But I'm starting to think that both a devenv-reusable bazelwrapper and a bazelStdenv in nixpkgs might actually be the same thing. |
Issue description
We already have pkgs.buildBazelPackage to build Bazel packages. This can often simplify building a package, but it's a rather high-level wrapper and doesn't account for situations where we would want to build toolchains around Bazel. This wrapper can also get tricky to use for packages with complex setups like Python/GPU packages, where downstream projects tend to use custom rules_cc setups to get CUDA, ROCm etc. working. buildBazelPackage also seems to be rather incompatible with remote execution.
The upcoming Bazel 7 will deprecate the WORKSPACE setups that most users are still used to. They will be superseded by bzlmod, a new package management system that is rather similar to Nix Flakes.
All of this makes me think that we are lacking a lower-level wrapper for Bazel, similar to the cc-wrapper, that provides lower-level control over the toolchains and environments that are passed to Bazel invocations.
Technical details
Some notes:
Here is what Bazel needs to run:
So essentially we should be more or less fine with a C++ stdenv and Java.
In multilayered container images Bazel also needs:
pkgs.fakeNss
mkdir -m 0777 tmp
pkgs.cacert
Related projects
Our use case at @eomii and for users of rules_ll is cache interoperability between remote execution docker images built with Nix and local Nix development environments. We can't use regular RBE images because they are too slow-moving to reliably build upstream LLVM. Standard RBE images are also irreproducible and unnecessarily bloated.
We are building the LLVM project and the ROCm/HIP stack many times per day and any build time improvements are worth it for us. So far, experiments have shown us something more along the lines of "infinity" improvements due to essentially completely skipping builds.
I believe this could also be interesting for tweag and users of rules_nixpkgs. Maybe @aherrmann can comment on this. I've seen that rules_nixpkgs_cc_toolchain already interoperates with the nix stdenv. Do you think a bazel-wrapper could yield any benefits? (Ah, and as a side note: does your cc toolchain work with RBE? In our experiments we've generated cc toolchain configs with the bazel-toolchains tools so far, but I suspect that your implementation might actually fit our use case better.)
If it turns out that a bazel-wrapper is actually a useful idea, I'd be happy to implement it. @jaroeichler might also be interested in helping out with an implementation.
cc @rrbutani @SomeoneSerge