6.12.2 Hard Lockup



I've been having issues with hard lockups: I get a few days of uptime, then the server locks up.

Today it happened again. My Plex Docker container's logs were already filling up /run before this. I've tried purging the logs, but they fill up again really fast.

I can't seem to get rid of this message repeating endlessly in log.json.

I have tried deleting it daily but that didn't help.

This afternoon the Unraid web UI would only load the banner and nothing else. Docker was offline, with all the containers down, when it hard locked. Diagnostics are attached, but I'm not sure there is anything useful in them since I had to reboot the system to export them.

The VMs were still running and reachable, though.

I had to power off the server.

 

The server has been up for only 3 hours and the log file is already at 5 MB.

Filesystem      Size  Used Avail Use% Mounted on
tmpfs            32M  5.0M   28M  16% /run

 

Below is the log.json file. The Nvidia drivers are the latest version, and so is the server.

{"level":"info","msg":"Using low-level runtime /usr/bin/runc","time":"2023-07-06T19:26:37+10:00"}
{"level":"info","msg":"Running with config:\n{\n  \"AcceptEnvvarUnprivileged\": true,\n  \"NVIDIAContainerCLIConfig\": {\n    \"Root\": \"\"\n  },\n  \"NVIDIACTKConfig\": {\n    \"Path\": \"nvidia-ctk\"\n  },\n  \"NVIDIAContainerRuntimeConfig\": {\n    \"DebugFilePath\": \"/dev/null\",\n    \"LogLevel\": \"info\",\n    \"Runtimes\": [\n      \"docker-runc\",\n      \"runc\"\n    ],\n    \"Mode\": \"auto\",\n    \"Modes\": {\n      \"CSV\": {\n        \"MountSpecPath\": \"/etc/nvidia-container-runtime/host-files-for-container.d\"\n      },\n      \"CDI\": {\n        \"SpecDirs\": null,\n        \"DefaultKind\": \"nvidia.com/gpu\",\n        \"AnnotationPrefixes\": [\n          \"cdi.k8s.io/\"\n        ]\n      }\n    }\n  },\n  \"NVIDIAContainerRuntimeHookConfig\": {\n    \"Path\": \"/usr/bin/nvidia-container-runtime-hook\",\n    \"SkipModeDetection\": false\n  }\n}","time":"2023-07-06T19:26:42+10:00"}
{"level":"info","msg":"Using low-level runtime /usr/bin/runc","time":"2023-07-06T19:26:42+10:00"}
{"level":"info","msg":"Running with config:\n{\n  \"AcceptEnvvarUnprivileged\": true,\n  \"NVIDIAContainerCLIConfig\": {\n    \"Root\": \"\"\n  },\n  \"NVIDIACTKConfig\": {\n    \"Path\": \"nvidia-ctk\"\n  },\n  \"NVIDIAContainerRuntimeConfig\": {\n    \"DebugFilePath\": \"/dev/null\",\n    \"LogLevel\": \"info\",\n    \"Runtimes\": [\n      \"docker-runc\",\n      \"runc\"\n    ],\n    \"Mode\": \"auto\",\n    \"Modes\": {\n      \"CSV\": {\n        \"MountSpecPath\": \"/etc/nvidia-container-runtime/host-files-for-container.d\"\n      },\n      \"CDI\": {\n        \"SpecDirs\": null,\n        \"DefaultKind\": \"nvidia.com/gpu\",\n        \"AnnotationPrefixes\": [\n          \"cdi.k8s.io/\"\n        ]\n      }\n    }\n  },\n  \"NVIDIAContainerRuntimeHookConfig\": {\n    \"Path\": \"/usr/bin/nvidia-container-runtime-hook\",\n    \"SkipModeDetection\": false\n  }\n}","time":"2023-07-06T19:26:47+10:00"}
{"level":"info","msg":"Using low-level runtime /usr/bin/runc","time":"2023-07-06T19:26:47+10:00"}
{"level":"info","msg":"Running with config:\n{\n  \"AcceptEnvvarUnprivileged\": true,\n  \"NVIDIAContainerCLIConfig\": {\n    \"Root\": \"\"\n  },\n  \"NVIDIACTKConfig\": {\n    \"Path\": \"nvidia-ctk\"\n  },\n  \"NVIDIAContainerRuntimeConfig\": {\n    \"DebugFilePath\": \"/dev/null\",\n    \"LogLevel\": \"info\",\n    \"Runtimes\": [\n      \"docker-runc\",\n      \"runc\"\n    ],\n    \"Mode\": \"auto\",\n    \"Modes\": {\n      \"CSV\": {\n        \"MountSpecPath\": \"/etc/nvidia-container-runtime/host-files-for-container.d\"\n      },\n      \"CDI\": {\n        \"SpecDirs\": null,\n        \"DefaultKind\": \"nvidia.com/gpu\",\n        \"AnnotationPrefixes\": [\n          \"cdi.k8s.io/\"\n        ]\n      }\n    }\n  },\n  \"NVIDIAContainerRuntimeHookConfig\": {\n    \"Path\": \"/usr/bin/nvidia-container-runtime-hook\",\n    \"SkipModeDetection\": false\n  }\n}","time":"2023-07-06T19:26:52+10:00"}
{"level":"info","msg":"Using low-level runtime /usr/bin/runc","time":"2023-07-06T19:26:52+10:00"}
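To see how often each message repeats, the JSON lines can be grouped by message and the gaps between timestamps measured. The following is only a rough sketch: the function name `summarize_log` is made up for illustration, and the sample contains just the three shorter "Using low-level runtime" entries copied from the excerpt above.

```python
import json
from collections import Counter
from datetime import datetime


def summarize_log(lines):
    """Count entries per message and list the seconds between repeats.

    Messages are keyed by the first line of "msg" so that the long
    multi-line config dump collapses into a single key.
    """
    counts = Counter()
    stamps = {}
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if "msg" not in entry or "time" not in entry:
            continue
        key = entry["msg"].split("\n", 1)[0]
        counts[key] += 1
        stamps.setdefault(key, []).append(datetime.fromisoformat(entry["time"]))
    gaps = {
        key: [(b - a).total_seconds() for a, b in zip(ts, ts[1:])]
        for key, ts in stamps.items()
    }
    return counts, gaps


# Three entries taken from the log excerpt in the post.
sample = [
    '{"level":"info","msg":"Using low-level runtime /usr/bin/runc","time":"2023-07-06T19:26:37+10:00"}',
    '{"level":"info","msg":"Using low-level runtime /usr/bin/runc","time":"2023-07-06T19:26:42+10:00"}',
    '{"level":"info","msg":"Using low-level runtime /usr/bin/runc","time":"2023-07-06T19:26:47+10:00"}',
]

counts, gaps = summarize_log(sample)
print(counts["Using low-level runtime /usr/bin/runc"])  # 3
print(gaps["Using low-level runtime /usr/bin/runc"])    # [5.0, 5.0]
```

On the excerpt, the same pair of messages comes back every 5 seconds, which suggests something is invoking the container runtime on a fixed interval. To run this against the real file, pass an open file handle for log.json to `summarize_log` instead of `sample`.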

tower-diagnostics-20230706-1927.zip

Edited by Maticks

Docker and the VM were both running when the web UI crashed while I was working. I then noticed my Docker containers were unreachable.

I suspect it is this ongoing Plex issue with the log file filling tmpfs. I can't seem to find a way to stop it from happening.

It grows by about 1 MB per hour; I suspect /run hits 100% and then everything crashes.

 

tmpfs            32M  6.4M   26M  20% /run
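As a back-of-the-envelope check on the figures above, the time until /run fills can be estimated from the free space and the growth rate. This is only a sketch: it assumes the growth is linear and that log.json is the only thing growing in /run, and `hours_until_full` is a hypothetical helper name.

```python
def hours_until_full(avail_mb, growth_mb_per_hour):
    """Estimate hours until a filesystem is full, assuming linear growth."""
    if growth_mb_per_hour <= 0:
        raise ValueError("growth rate must be positive")
    return avail_mb / growth_mb_per_hour


# Second post: 26M available, growing at roughly 1 MB per hour.
print(hours_until_full(26, 1.0))  # 26.0

# First post: 28M available, 5 MB written in the first 3 hours of uptime.
print(round(hours_until_full(28, 5.0 / 3), 1))  # 16.8
```

Both estimates come out at roughly a day or less, so if the lockups really take a few days to appear, the growth rate is probably not constant or the log is being cleared in between.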

 

Is there a way to disable Docker's verbose logging to log.json?

 

The only other thing I can think of is to try the older Nvidia driver, v530.41.03, and see if that stops this.

 

Edited by Maticks