Non-reproducibility in PF GPU clustering in 12834.42[23] workflows #47233

Open
makortel opened this issue Jan 31, 2025 · 9 comments

Comments

@makortel
Contributor

PR GPU tests in #47226 (comment) showed differences in workflow 12834.422 in ParticleFlow/PFClusterV such as

Image

Workflow 12834.423 showed the same kind of differences in ParticleFlow/PFClusterV as above, and also differences in ParticleFlow/pfClusterHBHEAlpakaV such as

Image

@makortel
Contributor Author

assign heterogeneous

@makortel
Contributor Author

@jsamudio

@cmsbuild
Contributor

New categories assigned: heterogeneous

@fwyzard,@makortel you have been requested to review this Pull request/Issue and eventually sign? Thanks

@cmsbuild
Contributor

cms-bot internal usage

@cmsbuild
Contributor

A new Issue was created by @makortel.

@Dr15Jones, @antoniovilela, @makortel, @mandrenguyen, @rappoccio, @sextonkennedy, @smuzaffar can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@makortel
Contributor Author

#47227 (comment) showed a different set of differences in ParticleFlow/PFClusterV for workflow 12834.422

Image

@jsamudio
Contributor

I believe the plots in ParticleFlow/pfClusterHBHEAlpakaV look about as expected between the legacy format and the Alpaka version. IIRC, and if the configuration in GitHub is to be believed, this is the comparison it is doing. See slide 7 in http://cds.cern.ch/record/2898660. For the plots in ParticleFlow/PFClusterV, I am not so sure.

@makortel
Contributor Author

> I believe the plots in ParticleFlow/pfClusterHBHEAlpakaV look about as expected between the legacy format and the Alpaka version

The problem is that these plots are different between different executions (whatever that meant in the tests of those PRs). The RelMon doesn't seem to show very well what the differences really are for 2D plots, such as pfCluster_RecHitMultiplicity_GPUvsCPU, but at least the number of entries seems to be different (1376 vs 1161).
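For 2D plots, a bin-by-bin comparison between two executions may show the differences more directly than the RelMon summary. Below is a minimal sketch in plain Python, assuming the bin contents of the same histogram from two runs have already been exported as nested lists. The helper names `total_entries` and `diff_bins` and the toy numbers are hypothetical and not taken from the actual DQM output; for unweighted fills without under/overflow, the sum of bin contents is used as a stand-in for the number of entries.

```python
def total_entries(hist):
    """Sum of all bin contents of a 2D histogram given as nested lists."""
    return sum(sum(row) for row in hist)


def diff_bins(hist_a, hist_b):
    """Return (ix, iy, content_a, content_b) for every bin that differs."""
    same_shape = len(hist_a) == len(hist_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(hist_a, hist_b)
    )
    if not same_shape:
        raise ValueError("histograms have different binning")
    return [
        (ix, iy, a, b)
        for ix, (row_a, row_b) in enumerate(zip(hist_a, hist_b))
        for iy, (a, b) in enumerate(zip(row_a, row_b))
        if a != b
    ]


# Toy bin contents standing in for one histogram from two executions
# (hypothetical numbers, for illustration only).
run_a = [[3, 0, 1], [0, 5, 2], [1, 0, 4]]
run_b = [[3, 0, 1], [0, 4, 2], [1, 0, 3]]

print("entries:", total_entries(run_a), "vs", total_entries(run_b))
for ix, iy, a, b in diff_bins(run_a, run_b):
    print(f"bin ({ix}, {iy}): {a} vs {b}")
```

An empty list from `diff_bins` would mean the two executions agree bin by bin; any non-empty result points at the bins to look at first.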

@jsamudio
Contributor

> > I believe the plots in ParticleFlow/pfClusterHBHEAlpakaV look about as expected between the legacy format and the Alpaka version
>
> The problem is that these plots are different between different executions (whatever that meant in the tests of those PRs). The RelMon doesn't seem to show very well what the differences really are for 2D plots, such as pfCluster_RecHitMultiplicity_GPUvsCPU, but at least the number of entries seems to be different (1376 vs 1161).

Okay I see now. I think I need to do some testing.
