Archiver for musl is not ar when in Docker image for a different architecture #1399

Open
samestep opened this issue Feb 13, 2025 · 20 comments · May be fixed by #1404

@samestep

samestep commented Feb 13, 2025

When I use cc natively on Linux or macOS, cc::Build::get_archiver always returns ar when I'm building for either x86 musl or ARM musl. However, when I try to do the same build in a rust Docker image, and the --platform is not my native platform, it sometimes attempts other names for the archiver program (like musl-ar or aarch64-linux-musl-ar) which cause the build to fail.

This forces me to set a TARGET_AR variable in my Dockerfile if I depend on crates like wasmtime that transitively depend on crates like zstd-sys; see this Stack Overflow question for such an example, which is itself boiled down from gradbench/gradbench#233 that was my real use case.

See this GitHub repository which includes a full code example to reproduce the issue. As also written in that README.md, here are the different values I see returned by cc::Build::get_archiver in various contexts:

  • ARM macOS
    • native: "ar"
    • native targeting x86 musl: "ar"
    • native targeting ARM musl: "ar"
    • Docker targeting x86 musl: "musl-ar"
    • Docker targeting ARM musl: "ar"
  • x86 Linux
    • native: "ar"
    • native targeting x86 musl: "ar"
    • native targeting ARM musl: "ar"
    • Docker targeting x86 musl: "ar"
    • Docker targeting ARM musl: "aarch64-linux-musl-ar"
@NobodyXu
Collaborator

It might be that musl-ar contains some special workaround necessary for the target?

@madsmtm madsmtm added the bug label Feb 14, 2025
@madsmtm
Collaborator

madsmtm commented Feb 14, 2025

This is kind of the expected behaviour, we check if you're cross-compiling, and attempt to use an archiver more suited for that task:

cc-rs/src/lib.rs

Line 3288 in fcf940e

} else if self.get_is_cross_compile()? {

The reason you're seeing something different on your local machine might be because you have AR, RUSTC_LINKER or similar set locally?

@madsmtm
Collaborator

madsmtm commented Feb 14, 2025

Or actually, I think we choose musl-ar/aarch64-linux-musl-ar over ar because they exist:

cc-rs/src/lib.rs

Lines 3301 to 3302 in fcf940e

if Command::new(&target_p).output().is_ok() {
    chosen = target_p;
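
The selection logic described in the comments above can be sketched roughly as follows. This is a minimal illustration, not the cc-rs source: `hypothetical_prefix` and `choose_archiver` are made-up names, the prefix table covers only the two targets from this issue and is inferred from the archiver names reported in the first comment, and the `probe` closure stands in for the `Command::new(..).output().is_ok()` check quoted above.

```rust
/// Illustrative prefix table covering only the two targets in this issue.
/// The mapping is inferred from the archiver names reported above
/// ("musl-ar", "aarch64-linux-musl-ar"); it is not copied from cc-rs.
fn hypothetical_prefix(target: &str) -> Option<&'static str> {
    match target {
        "x86_64-unknown-linux-musl" => Some("musl"),
        "aarch64-unknown-linux-musl" => Some("aarch64-linux-musl"),
        _ => None,
    }
}

/// Pick an archiver name. `probe` answers "does this program seem usable?"
/// and is a stand-in for the spawn check quoted above.
fn choose_archiver(host: &str, target: &str, probe: impl Fn(&str) -> bool) -> String {
    let default = String::from("ar");
    if host == target {
        // Not cross-compiling: keep the plain archiver.
        return default;
    }
    match hypothetical_prefix(target) {
        Some(prefix) => {
            let candidate = format!("{prefix}-ar");
            // Only switch to the prefixed name if the probe accepts it.
            if probe(&candidate) {
                candidate
            } else {
                default
            }
        }
        None => default,
    }
}
```

Under this sketch, a probe that wrongly reports a missing binary as usable would yield "musl-ar" or "aarch64-linux-musl-ar" even though only `ar` is installed, which matches the symptom reported in this issue.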

@madsmtm
Collaborator

madsmtm commented Feb 14, 2025

Why do the musl-ar/aarch64-linux-musl-ar binaries in your Docker container not work?

@samestep
Author

@NobodyXu

It might be that musl-ar contains some special workaround necessary for the target?

But then I have two questions:

  1. why does everything work correctly when I set ENV TARGET_AR=ar in my Dockerfile?
  2. how am I meant to install musl-ar? As shown in that linked Stack Overflow question, it is not installed as part of apt-get install musl-tools.
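
For context on why `ENV TARGET_AR=ar` sidesteps the probe: an explicit override is consulted before any guessing happens. The sketch below assumes the lookup order documented in the cc-rs README (`<var>_<target>`, `<var>_<target_with_underscores>`, `<build-kind>_<var>`, `<var>`) and assumes a cross-compiling build, so the build-kind prefix is `TARGET`. The function name `archiver_override` and the `lookup` parameter are hypothetical; `lookup` abstracts over `std::env::var` so the order can be exercised without touching the real environment.

```rust
/// Return the first archiver override found, trying variable names from the
/// most specific to the least specific. Assumes a cross-compiling build, so
/// the build-kind variable is `TARGET_AR` rather than `HOST_AR`.
fn archiver_override(
    target: &str,
    lookup: impl Fn(&str) -> Option<String>,
) -> Option<String> {
    let candidates = [
        // e.g. AR_x86_64-unknown-linux-musl
        format!("AR_{target}"),
        // e.g. AR_x86_64_unknown_linux_musl
        format!("AR_{}", target.replace('-', "_")),
        String::from("TARGET_AR"),
        String::from("AR"),
    ];
    candidates.iter().find_map(|name| lookup(name.as_str()))
}
```

With only `TARGET_AR=ar` set, this returns `Some("ar")` for either musl target, so the prefixed candidates are never tried.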

@madsmtm

This is kind of the expected behaviour, we check if you're cross-compiling, and attempt to use an archiver more suited for that task:

I thought that too at first, but get_is_cross_compile just checks whether TARGET and HOST are the same, right?

cc-rs/src/lib.rs

Lines 3551 to 3558 in 15fe112

fn get_is_cross_compile(&self) -> Result<bool, Error> {
    let target = self.get_raw_target()?;
    let host: Cow<'_, str> = match &self.host {
        Some(h) => Cow::Borrowed(h),
        None => Cow::Owned(self.getenv_unwrap_str("HOST")?),
    };
    Ok(host != target)
}

The archiver is something other than ar in only 2/10 of the cases I listed above, whereas TARGET and HOST differ in 8/10 of the cases, as you can see in the additional output I've just added to the GitHub README I linked:

  • ARM macOS
    • native:
      TARGET = Some(aarch64-apple-darwin)
      HOST = Some(aarch64-apple-darwin)
      
    • native targeting x86 musl:
      TARGET = Some(x86_64-unknown-linux-musl)
      HOST = Some(aarch64-apple-darwin)
      
    • native targeting ARM musl:
      TARGET = Some(aarch64-unknown-linux-musl)
      HOST = Some(aarch64-apple-darwin)
      
    • Docker targeting x86 musl:
      TARGET = Some(x86_64-unknown-linux-musl)
      HOST = Some(x86_64-unknown-linux-gnu)
      
    • Docker targeting ARM musl:
      TARGET = Some(aarch64-unknown-linux-musl)
      HOST = Some(aarch64-unknown-linux-gnu)
      
  • x86 Linux
    • native:
      TARGET = Some(x86_64-unknown-linux-gnu)
      HOST = Some(x86_64-unknown-linux-gnu)
      
    • native targeting x86 musl:
      TARGET = Some(x86_64-unknown-linux-musl)
      HOST = Some(x86_64-unknown-linux-gnu)
      
    • native targeting ARM musl:
      TARGET = Some(aarch64-unknown-linux-musl)
      HOST = Some(x86_64-unknown-linux-gnu)
      
    • Docker targeting x86 musl:
      TARGET = Some(x86_64-unknown-linux-musl)
      HOST = Some(x86_64-unknown-linux-gnu)
      
    • Docker targeting ARM musl:
      TARGET = Some(aarch64-unknown-linux-musl)
      HOST = Some(aarch64-unknown-linux-gnu)
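
The 8/10 figure can be checked mechanically. The sketch below applies the same `host != target` comparison as the quoted `get_is_cross_compile` to the ten (HOST, TARGET) pairs listed above; the names `is_cross_compile`, `CASES`, and `count_cross_compiles` are made up for this illustration.

```rust
/// Same comparison as the quoted `get_is_cross_compile`: any difference
/// between the two triples counts as cross-compilation.
fn is_cross_compile(host: &str, target: &str) -> bool {
    host != target
}

/// The ten (HOST, TARGET) pairs from the list above, in the same order:
/// five for ARM macOS, then five for x86 Linux.
const CASES: [(&str, &str); 10] = [
    ("aarch64-apple-darwin", "aarch64-apple-darwin"),
    ("aarch64-apple-darwin", "x86_64-unknown-linux-musl"),
    ("aarch64-apple-darwin", "aarch64-unknown-linux-musl"),
    ("x86_64-unknown-linux-gnu", "x86_64-unknown-linux-musl"),
    ("aarch64-unknown-linux-gnu", "aarch64-unknown-linux-musl"),
    ("x86_64-unknown-linux-gnu", "x86_64-unknown-linux-gnu"),
    ("x86_64-unknown-linux-gnu", "x86_64-unknown-linux-musl"),
    ("x86_64-unknown-linux-gnu", "aarch64-unknown-linux-musl"),
    ("x86_64-unknown-linux-gnu", "x86_64-unknown-linux-musl"),
    ("aarch64-unknown-linux-gnu", "aarch64-unknown-linux-musl"),
];

/// Count how many of the listed cases are treated as cross-compilation.
fn count_cross_compiles() -> usize {
    CASES
        .iter()
        .filter(|case| is_cross_compile(case.0, case.1))
        .count()
}
```

Only the two "native" rows have identical triples, so the cross-compile branch is taken in the other eight cases, yet the archiver name changes in just two of them. The cross-compile check alone therefore cannot explain the difference.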
      

The reason you're seeing something different on your local machine might be because you have AR, RUSTC_LINKER or similar set locally?

I don't. But to be clear, I'm more confused about the difference between the linux/amd64 and linux/arm64 versions of the Docker image when I run it on machines of different architectures; what happens when I run locally is less important.

Or actually, I think we choose musl-ar/aarch64-linux-musl-ar over ar because they exist:

cc-rs/src/lib.rs

Lines 3301 to 3302 in fcf940e

if Command::new(&target_p).output().is_ok() {
    chosen = target_p;

No; they don't exist in those Docker images, which is why the build fails when I don't add ENV TARGET_AR=ar.

Why do the musl-ar/aarch64-linux-musl-ar binaries in your Docker container not work?

As mentioned above and in the Stack Overflow answer, they don't exist in the Docker container.

@samestep
Author

@madsmtm just to clarify, are you able to reproduce the behavior in Docker using the instructions I gave in that GitHub repository?

To run in Docker targeting x86 musl:

docker build --platform linux/amd64 . -t foo-x86 && docker run --rm foo-x86

To run in Docker targeting ARM musl:

docker build --platform linux/arm64 . -t foo-arm && docker run --rm foo-arm

@NobodyXu
Collaborator

NobodyXu commented Feb 14, 2025

I'm more confused about the difference between the linux/amd64 and linux/arm64 versions of the Docker image when I run it on machines of different architectures;

No; they don't exist in those Docker images, which is why the build fails when I don't add ENV TARGET_AR=ar.

cc-rs/src/lib.rs

Lines 3301 to 3302 in fcf940e

if Command::new(&target_p).output().is_ok() {
    chosen = target_p;

This check is for cross-compilation. It seems that we are not cross-compiling here, but rather just changing HOST by using a different Docker image.

Is that correct?

@NobodyXu
Collaborator

NobodyXu commented Feb 14, 2025

I suspect that maybe the Docker image comes with environment variables set for AR, because I can't find any musl- in our code; however, I checked the Dockerfile and didn't find any.

Maybe you can run env inside the Docker containers to see if any env var is set?

@madsmtm
Collaborator

madsmtm commented Feb 14, 2025

No; they don't exist in those Docker images, which is why the build fails when I don't add ENV TARGET_AR=ar.

Huh. But that's the entire reason for that check?

are you able to reproduce the behavior in Docker

Not yet. I don't have Docker installed locally, and it's kind of a hassle, so I was hoping to do without.

I can't find any musl- in our code; however, I checked the Dockerfile and didn't find any.

See prefix_for_target.

@madsmtm
Collaborator

madsmtm commented Feb 14, 2025

If you could provide the output of env in the container like @NobodyXu said, as well as the output of which musl-ar && which ar && which aarch64-linux-musl-ar, that'd be helpful; otherwise I'll try to get Docker set up and test it locally.

@samestep
Author

samestep commented Feb 14, 2025

@madsmtm

If you could provide the output of env in the container like @NobodyXu said, as well as the output of which musl-ar && which ar && which aarch64-linux-musl-ar, that'd be helpful; otherwise I'll try to get Docker set up and test it locally.

Sure thing! Similar to before, I ran this command first:

$ docker build --platform linux/amd64,linux/arm64 . -t foo

Here are the outputs when running on my M1 MacBook:

$ docker run --platform linux/amd64 --rm foo env
PATH=/usr/local/cargo/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=0207e7628bc5
RUSTUP_HOME=/usr/local/rustup
CARGO_HOME=/usr/local/cargo
RUST_VERSION=1.84.1
HOME=/root
$ docker run --platform linux/amd64 --rm foo which ar
/usr/bin/ar
$ docker run --platform linux/amd64 --rm foo which musl-ar
$ docker run --platform linux/amd64 --rm foo which aarch64-linux-musl-ar
$ docker run --platform linux/arm64 --rm foo env
PATH=/usr/local/cargo/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=562d3a2a58a4
RUSTUP_HOME=/usr/local/rustup
CARGO_HOME=/usr/local/cargo
RUST_VERSION=1.84.1
HOME=/root
$ docker run --platform linux/arm64 --rm foo which ar
/usr/bin/ar
$ docker run --platform linux/arm64 --rm foo which musl-ar
$ docker run --platform linux/arm64 --rm foo which aarch64-linux-musl-ar

And here they are when running on my x86 Linux machine:

$ docker run --platform linux/amd64 --rm foo env
PATH=/usr/local/cargo/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=bf34022b9b8e
RUSTUP_HOME=/usr/local/rustup
CARGO_HOME=/usr/local/cargo
RUST_VERSION=1.84.1
HOME=/root
$ docker run --platform linux/amd64 --rm foo which ar
/usr/bin/ar
$ docker run --platform linux/amd64 --rm foo which musl-ar
$ docker run --platform linux/amd64 --rm foo which aarch64-linux-musl-ar
$ docker run --platform linux/arm64 --rm foo env
PATH=/usr/local/cargo/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=0f32dd3d45c9
RUSTUP_HOME=/usr/local/rustup
CARGO_HOME=/usr/local/cargo
RUST_VERSION=1.84.1
HOME=/root
$ docker run --platform linux/arm64 --rm foo which ar
/usr/bin/ar
$ docker run --platform linux/arm64 --rm foo which musl-ar
$ docker run --platform linux/arm64 --rm foo which aarch64-linux-musl-ar

As you can see, all the which commands print no output and return exit code 1, except for which ar.
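
The same lookup that `which` performs can be approximated in Rust without spawning a process. The sketch below is an illustration only and is not what cc-rs does; `find_in_path` is a hypothetical helper, and unlike `which` it checks only that a regular file exists, not that it is executable.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::PathBuf;

/// Return the first `dir/name` that is a regular file, for each `dir` in a
/// PATH-style list. Unlike `which`, this does not check the executable bit.
fn find_in_path(name: &str, path_list: &OsStr) -> Option<PathBuf> {
    env::split_paths(path_list)
        .map(|dir| dir.join(name))
        .find(|candidate| candidate.is_file())
}
```

Given the PATH printed above, a call such as `find_in_path("musl-ar", ...)` would be expected to return `None` in these images, while `find_in_path("ar", ...)` would be expected to resolve to `/usr/bin/ar`.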

@samestep
Author

samestep commented Feb 14, 2025

@NobodyXu

This check is for cross-compilation. It seems that we are not cross-compiling here, but rather just changing HOST by using a different Docker image.

Is that correct?

Correct, I think! Because passing --platform to docker run uses emulation to change the host architecture. But, it's still cross-compilation anyways, right? Since, e.g., x86_64-unknown-linux-musl is still different from x86_64-unknown-linux-gnu, and aarch64-unknown-linux-musl is still different from aarch64-unknown-linux-gnu. Unless I misunderstand?

@madsmtm
Collaborator

madsmtm commented Feb 14, 2025

Thanks for the output, that is indeed weird. I'm out of time for today, but I'll take a look during the weekend (if @NobodyXu doesn't get to it first ;)).

But, it's still cross-compilation anyways, right? Since, e.g., x86_64-unknown-linux-musl is still different from x86_64-unknown-linux-gnu, and aarch64-unknown-linux-musl is still different from aarch64-unknown-linux-gnu.

Yeah, we're still cross-compiling here.

@NobodyXu
Collaborator

Correct, I think! Because passing --platform to docker run uses emulation to change the host architecture. But, it's still cross-compilation anyways, right? Since, e.g., x86_64-unknown-linux-musl is still different from x86_64-unknown-linux-gnu, and aarch64-unknown-linux-musl is still different from aarch64-unknown-linux-gnu. Unless I misunderstand?

While it's technically cross-compilation, cc-rs cannot detect it, since the rustc host changes along with the target.

@samestep
Author

@NobodyXu hmm... I'm not sure I quite understand, could you explain? Cross-compilation just means the target and host are different, and they are different in this case because one is *-musl and the other is *-gnu, right?

@samestep
Author

And all those TARGET = Some(...) and HOST = Some(...) outputs are printed directly by cc, so I don't follow what you mean when you say it wouldn't be able to detect it.

@NobodyXu
Collaborator

Oh sorry, I got confused 😂 So you are compiling for musl in a glibc environment; yes, that's always cross-compilation.

@NobodyXu
Collaborator

cc @samestep can you try #1404 to see if it fixed the bug please?

@samestep
Author

samestep commented Feb 15, 2025

@NobodyXu yes, that works! But I still don't understand how this bug happened in the first place; why does the rust Docker image for linux/amd64 only have this issue on my Mac and not my Linux machine, and similarly the linux/arm64 image only has this issue on my Linux machine and not my Mac? How is cc able to tell the difference between those two environments when inside of a Docker container?

@NobodyXu
Collaborator

I suspect this is an implementation difference.

If you check the PR, the only thing I've added is the exit status check.

Before the PR, we only checked whether spawning the process succeeded; afterwards, we also check the exit status of the spawned process.

So if Command looks for the binary and raises an error before spawning, then our original code works.

If it looks for the binary only after spawning the process, then the original code does not work.

This might be the difference between fork + exec and vfork + exec: vfork requires the binary to be located before spawning.

@samestep you can try using strace to find out if fork or vfork is used
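
The difference between the two checks can be sketched as follows. This is a minimal illustration under assumptions, not the code from #1404: the function names are hypothetical, and the exact arguments and exit-status condition used in the PR may differ.

```rust
use std::process::Command;

/// Looser check: true as soon as the OS reports that the spawn succeeded,
/// regardless of how the process then exits.
fn spawn_succeeded(program: &str) -> bool {
    Command::new(program).output().is_ok()
}

/// Stricter check: the spawn must succeed and the process must also exit
/// with a success status. Which `args` make a given archiver exit
/// successfully is an assumption left to the caller.
fn ran_successfully(program: &str, args: &[&str]) -> bool {
    Command::new(program)
        .args(args)
        .output()
        .map(|out| out.status.success())
        .unwrap_or(false)
}
```

If a missing binary is reported as a spawn error, both checks reject it. If the spawn itself appears to succeed and the failure only shows up as a non-zero exit status, only the stricter check rejects it, which is consistent with the explanation above.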
