Kubernetes fails to create a checkpoint using CRIU #2594
Comments
@adrianreber Could you help me take a look at this? Thank you.
Don't use CentOS 7. It is EOL and was never a supported platform for CRIU. In your case you have an established TCP connection, and currently the only way to handle this is to use a CRIU configuration file that tells CRIU to handle it.
I will consider abandoning CentOS 7 later, but for now, how should I use a CRIU configuration file?
Take a look at the man page or the wiki. It is all documented there.
I'm not quite sure which options should be added to the CRIU configuration file to resolve the current issue, or whether checkpoint creation through kubelet would still work with that configuration. Could you elaborate on this in more detail?
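For the established-TCP case mentioned above, a minimal sketch of such a configuration file might look like the following. It assumes that the runc in use points CRIU at /etc/criu/runc.conf (otherwise CRIU reads /etc/criu/default.conf), and that `tcp-established` is the only option needed here; neither assumption has been confirmed against this particular setup. CRIU configuration files take one option per line, written as the long command-line option without the leading dashes.

```
# /etc/criu/runc.conf (assumed path; see the criu man page, section on configuration files)
# Same effect as passing --tcp-established on the criu command line.
tcp-established
```

If the checkpoint is later restored, the same option would most likely be needed on the restore side as well.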
When sending a POST request to kubelet, I get the following response:
checkpointing of default/user-5c57b749d8-4bx8t/hotel-reserv-user failed (rpc error: code = Unknown desc = failed to checkpoint container 87ea0f1e6cf80fca7c22afee97cbea6ba749686c6942a0e86d22b42523562c7d: running "/usr/bin/runc" ["checkpoint" "--file-locks" "--image-path" "/var/lib/containers/storage/overlay-containers/87ea0f1e6cf80fca7c22afee97cbea6ba749686c6942a0e86d22b42523562c7d/userdata/checkpoint" "--work-path" "/var/lib/containers/storage/overlay-containers/87ea0f1e6cf80fca7c22afee97cbea6ba749686c6942a0e86d22b42523562c7d/userdata" "--leave-running" "87ea0f1e6cf80fca7c22afee97cbea6ba749686c6942a0e86d22b42523562c7d"] failed:
/usr/bin/runc --root /run/runc --systemd-cgroup checkpoint --file-locks --image-path /var/lib/containers/storage/overlay-containers/87ea0f1e6cf80fca7c22afee97cbea6ba749686c6942a0e86d22b42523562c7d/userdata/checkpoint --work-path /var/lib/containers/storage/overlay-containers/87ea0f1e6cf80fca7c22afee97cbea6ba749686c6942a0e86d22b42523562c7d/userdata --leave-running 87ea0f1e6cf80fca7c22afee97cbea6ba749686c6942a0e86d22b42523562c7d failed: time="2025-02-10T01:38:52+08:00" level=error msg="criu failed: type NOTIFY errno 0\nlog file: /var/lib/containers/storage/overlay-containers/87ea0f1e6cf80fca7c22afee97cbea6ba749686c6942a0e86d22b42523562c7d/userdata/dump.log"
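For reference, the request behind the response above goes to the kubelet checkpoint endpoint, which has the form `POST /checkpoint/{namespace}/{pod}/{container}`. The sketch below only builds that URL for the container named in the error message; the helper name `checkpoint_url`, the node name `localhost`, and the default kubelet port 10250 are assumptions for illustration, not details taken from this report.

```python
from urllib.parse import quote


def checkpoint_url(node: str, namespace: str, pod: str, container: str,
                   port: int = 10250) -> str:
    """Build the kubelet checkpoint URL (hypothetical helper).

    The path layout follows the kubelet checkpoint API:
    /checkpoint/{namespace}/{pod}/{container}
    """
    # Percent-encode each path segment so unusual names cannot break the URL.
    path = "/".join(quote(part, safe="") for part in (namespace, pod, container))
    return f"https://{node}:{port}/checkpoint/{path}"


if __name__ == "__main__":
    # Namespace, pod, and container are taken from the error message above;
    # "localhost" is an assumed node address.
    print(checkpoint_url("localhost", "default",
                         "user-5c57b749d8-4bx8t", "hotel-reserv-user"))
```

The URL would then be sent as a POST with the kubelet client certificate and key, for example with curl.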
Strangely, when I try to create a checkpoint of a simple Nginx application, most attempts succeed, but I occasionally encounter this error. Could this be an environmental issue?
Here is dump.log:
dump.log
Here is some potentially useful environment information:
CRIU Version: 3.17.1
criu check --all -v4
Other env info