"unknown signal?!" reported from JobInfo terminatedSignal #26

EricR86 · 2019-05-29T19:51:23Z

Hello,

I'm not sure if this is the origin of this particular bug, but I have not successfully reproduced this error on other DRMAA implementations.

I've submitted jobs to my SLURM 18.08 system where, occasionally, I get a reported "unknown signal?!". The exact same job, when resubmitted, may or may not have this issue. I cannot track down exactly what happens when this occurs or what causes it.

I have run strace on equivalent submitted jobs, one that reports the "unknown signal?!" and one that exits normally, and I cannot find any discernible difference between them, even when tracing specifically for signals.

sacct reports nothing unusual, and actually seems to indicate that the job exited without issue. The sysadmin for our cluster system seems to agree and cannot find any issue.
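
For anyone trying to reproduce this, the `ExitCode` column that sacct prints can be compared against what DRMAA reports. Slurm formats that field as `<exit status>:<signal>`. The following is a minimal sketch under that assumption, not output from the cluster described here. `parse_sacct_exit_code` and `SlurmExit` are hypothetical names, and the sample strings in the usage note are illustrative.

```python
import signal
from typing import NamedTuple, Optional


class SlurmExit(NamedTuple):
    """Decoded form of a sacct ExitCode field (hypothetical helper type)."""
    exit_status: int
    signal_number: int
    signal_name: Optional[str]


def parse_sacct_exit_code(field: str) -> SlurmExit:
    """Parse a sacct ExitCode field of the form '<exit status>:<signal>'."""
    status_text, _, signal_text = field.strip().partition(":")
    exit_status = int(status_text)
    # A missing signal part is treated as "no signal" (an assumption).
    signal_number = int(signal_text) if signal_text else 0

    signal_name = None
    if signal_number:
        try:
            signal_name = signal.Signals(signal_number).name
        except ValueError:
            # Number does not map to a signal known on this platform.
            signal_name = None
    return SlurmExit(exit_status, signal_number, signal_name)
```

Something like `sacct -j <jobid> --format=JobID,State,ExitCode --noheader --parsable2` should produce the field to feed in. A value such as `"0:0"` decodes to no signal, while `"0:15"` decodes to `SIGTERM`. If sacct shows `0:0` for a job that DRMAA flags as signalled, that would suggest the discrepancy arises in how the status is translated rather than in the job itself.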

This could be a cluster-specific issue, a DRMAA issue, or something else entirely. If I'm looking in the wrong place, please kindly redirect me. I'm not sure where or how I could start tracking down this issue.

Thanks for your time.

EricR86 mentioned this issue Aug 14, 2019

Doesn't handle cross-platform exit statuses gracefully #30

Open
