Kuberay CRDs cause crash of MultiKueue integration tests setup #3987

Open
mszadkow opened this issue Jan 16, 2025 · 0 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

mszadkow commented Jan 16, 2025

What happened:
Adding the KubeRay CRDs to the MultiKueue integration tests causes the tests to get stuck and fail.
As a result, the integration tests CI job never finishes.

  panic: Your Test Panicked
        callback[0]()
        /Users/michal_szadkowski/workspace/kueue/vendor/github.com/onsi/ginkgo/v2/internal/suite.go:323
          When you, or your assertion library, calls Ginkgo's Fail(),
          Ginkgo panics to prevent subsequent assertions from running.
  
          Normally Ginkgo rescues this panic so you shouldn't see it.
  
          However, if you make an assertion in a goroutine, Ginkgo can't capture the
          panic.
          To circumvent this, you should call
  
                defer GinkgoRecover()
  
          at the top of the goroutine that caused this panic.
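
The panic message above refers to a general Go rule: a deferred recover only catches panics raised in its own goroutine. The sketch below is a standard-library analogue of that rule, not Kueue or Ginkgo code; `runInGoroutine` is a hypothetical helper introduced only for illustration, and its inline recover stands in for the `defer GinkgoRecover()` call the message recommends. It does not claim to identify which goroutine fails in this issue.

```go
package main

import (
	"fmt"
	"sync"
)

// runInGoroutine (hypothetical helper) runs fn in a new goroutine and returns
// the value it panicked with, or nil if fn returned normally. The deferred
// recover must sit inside the goroutine: a recover in the caller's goroutine
// would never see this panic.
func runInGoroutine(fn func()) (recovered any) {
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		// Stand-in for `defer GinkgoRecover()`: recover at the top of the goroutine.
		defer func() { recovered = recover() }()
		fn()
	}()
	wg.Wait()
	return recovered
}

func main() {
	got := runInGoroutine(func() { panic("assertion failed") })
	fmt.Println("recovered:", got)
}
```

In an actual Ginkgo test, the equivalent is to place `defer GinkgoRecover()` as the first statement of any goroutine that makes assertions, as the message says.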

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Add the KubeRay CRDs to the MultiKueue integration tests by listing them as another entry in DepCRDPaths in createCluster().

func createCluster(setupFnc framework.ManagerSetup, apiFeatureGates ...string) cluster {
	c := cluster{}
	c.fwk = &framework.Framework{
		CRDPath:     filepath.Join("..", "..", "..", "config", "components", "crd", "bases"),
		WebhookPath: filepath.Join("..", "..", "..", "config", "components", "webhook"),
		DepCRDPaths: []string{filepath.Join("..", "..", "..", "dep-crds", "jobset-operator"),
			filepath.Join("..", "..", "..", "dep-crds", "training-operator-crds"),
			filepath.Join("..", "..", "..", "dep-crds", "mpi-operator"),
			filepath.Join("..", "..", "..", "dep-crds", "ray-operator-crds"), // added entry: KubeRay CRDs
		},
		APIServerFeatureGates: apiFeatureGates,
	}

A minimal reproduction scenario is available in this PR.

Anything else we need to know?:

The issue has been investigated by @mszadkow, @mbobrovskyi and @mimowo.
The only thing that made the tests pass was reducing INTEGRATION_NPROCS, which is a hacky workaround.
We also don't know why this workaround works.

The problem seems to be connected solely to the KubeRay CRDs: adding more CRDs from other sources did not cause the same issue.

Environment:

  • Kubernetes version (use kubectl version):
  • Kueue version (use git describe --tags --dirty --always):
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:
mszadkow added the kind/bug label on Jan 16, 2025.