VRx Posted May 22, 2023 I run into the following problem from time to time. Sometimes it occurs two days in a row and then not at all for a month. wolverine-diagnostics-20230522-1837.zip
JorgeB Posted May 22, 2023 You are getting multiple out-of-memory errors. Try limiting the RAM for VMs and/or Docker containers further. The problem is usually not so much a lack of RAM as fragmented RAM. Alternatively, a small swap file on disk might help; you can use the swapfile plugin: https://forums.unraid.net/topic/109342-plugin-swapfile-for-691/
VRx (Author) Posted May 22, 2023 The OOM occurs within one container because of that container's memory limit (8 GB). The server has 64 GB in total and is far from using all of its RAM.
VRx (Author) Posted May 23, 2023 13 hours ago, JorgeB said: "The OOM errors are for the server." And? Containers don't have their own kernel, so how would they have their own OOM killer? I don't see the connection. The container has a RAM limit, the container reached that limit, and a process in that container was killed by the oom-killer; everything worked the way Docker is meant to work. Did you find anywhere in the logs that the Docker daemon was stopped by the oom-killer?
VRx (Author) Posted June 12, 2023 Another crash. This time the container that had been using all of its allocated RAM was completely disabled, and there is nothing about OOM in the logs. I'm curious about your conclusions now, Mr @JorgeB. wolverine-diagnostics-20230612-0726.zip
JorgeB Posted June 12, 2023 Btrfs is detecting data corruption on cache:
Jun 12 04:23:54 Wolverine kernel: BTRFS warning (device nvme0n1p1): csum failed root 5 ino 78451 off 2202677248 csum 0xcee5889b expected csum 0x1d96fbcb mirror 1
Jun 12 04:23:54 Wolverine kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme1n1p1 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
Jun 12 04:23:54 Wolverine kernel: BTRFS warning (device nvme0n1p1): csum failed root 5 ino 78451 off 2202677248 csum 0xcee5889b expected csum 0x1d96fbcb mirror 2
Jun 12 04:23:54 Wolverine kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
Jun 12 04:23:54 Wolverine kernel: BTRFS warning (device nvme0n1p1): csum failed root 5 ino 78451 off 2202677248 csum 0xcee5889b expected csum 0x1d96fbcb mirror 1
Jun 12 04:23:54 Wolverine kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme1n1p1 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
Jun 12 04:23:54 Wolverine kernel: BTRFS warning (device nvme0n1p1): csum failed root 5 ino 78451 off 2202677248 csum 0xcee5889b expected csum 0x1d96fbcb mirror 2
Jun 12 04:23:54 Wolverine kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
Jun 12 04:23:54 Wolverine kernel: BTRFS warning (device nvme0n1p1): csum failed root 5 ino 78451 off 2202677248 csum 0xcee5889b expected csum 0x1d96fbcb mirror 1
Jun 12 04:23:54 Wolverine kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme1n1p1 errs: wr 0, rd 0, flush 0, corrupt 3, gen 0
Jun 12 04:23:54 Wolverine kernel: BTRFS warning (device nvme0n1p1): csum failed root 5 ino 78451 off 2202677248 csum 0xcee5889b expected csum 0x1d96fbcb mirror 2
Jun 12 04:23:54 Wolverine kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 3, gen 0
Jun 12 04:23:54 Wolverine kernel: BTRFS warning (device nvme0n1p1): csum failed root 5 ino 78451 off 2202677248 csum 0xcee5889b expected csum 0x1d96fbcb mirror 1
Jun 12 04:23:54 Wolverine kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme1n1p1 errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
Jun 12 04:23:54 Wolverine kernel: BTRFS warning (device nvme0n1p1): csum failed root 5 ino 78451 off 2202677248 csum 0xcee5889b expected csum 0x1d96fbcb mirror 2
Jun 12 04:23:54 Wolverine kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
Jun 12 04:24:34 Wolverine kernel: BTRFS warning (device nvme0n1p1): csum failed root 5 ino 78599 off 5849088 csum 0x6178d7d2 expected csum 0xb20ba482 mirror 1
Jun 12 04:24:34 Wolverine kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme1n1p1 errs: wr 0, rd 0, flush 0, corrupt 5, gen 0
Jun 12 04:24:34 Wolverine kernel: BTRFS warning (device nvme0n1p1): csum failed root 5 ino 78599 off 5849088 csum 0x6178d7d2 expected csum 0xb20ba482 mirror 2
Jun 12 04:24:34 Wolverine kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 5, gen 0
Jun 12 04:25:14 Wolverine kernel: BTRFS warning (device nvme0n1p1): csum failed root 5 ino 78754 off 4096 csum 0xf372d89d expected csum 0x2001abcd mirror 1
Jun 12 04:25:14 Wolverine kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme1n1p1 errs: wr 0, rd 0, flush 0, corrupt 6, gen 0
Start by running memtest, then run a correcting scrub and look for uncorrectable errors.
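For anyone reading a long syslog like the one above, here is a minimal Python sketch that summarizes the Btrfs checksum messages: which inodes failed on which mirror, and the highest `corrupt` counter reported per device. The function name `summarize_btrfs_errors` and the regular expressions are my own assumptions, written against the exact line format quoted in this thread; other kernel versions may word these messages differently. The sample uses a few lines copied from the log above.

```python
import re

# Sample lines copied from the log quoted in this thread.
SAMPLE_LOG = """\
Jun 12 04:23:54 Wolverine kernel: BTRFS warning (device nvme0n1p1): csum failed root 5 ino 78451 off 2202677248 csum 0xcee5889b expected csum 0x1d96fbcb mirror 1
Jun 12 04:23:54 Wolverine kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme1n1p1 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
Jun 12 04:23:54 Wolverine kernel: BTRFS warning (device nvme0n1p1): csum failed root 5 ino 78451 off 2202677248 csum 0xcee5889b expected csum 0x1d96fbcb mirror 2
Jun 12 04:23:54 Wolverine kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
Jun 12 04:25:14 Wolverine kernel: BTRFS warning (device nvme0n1p1): csum failed root 5 ino 78754 off 4096 csum 0xf372d89d expected csum 0x2001abcd mirror 1
Jun 12 04:25:14 Wolverine kernel: BTRFS error (device nvme0n1p1): bdev /dev/nvme1n1p1 errs: wr 0, rd 0, flush 0, corrupt 6, gen 0
"""

# Assumed patterns, matching only the message format shown in this thread.
CSUM_RE = re.compile(r"csum failed root (\d+) ino (\d+) off (\d+) .* mirror (\d+)")
ERRS_RE = re.compile(
    r"bdev (\S+) errs: wr (\d+), rd (\d+), flush (\d+), corrupt (\d+), gen (\d+)"
)


def summarize_btrfs_errors(text):
    """Return (mirrors_by_inode, max_corrupt_by_device) for the given log text."""
    mirrors_by_inode = {}       # inode number -> set of mirrors that failed
    max_corrupt_by_device = {}  # block device -> highest 'corrupt' counter seen
    for line in text.splitlines():
        csum = CSUM_RE.search(line)
        if csum:
            inode, mirror = int(csum.group(2)), int(csum.group(4))
            mirrors_by_inode.setdefault(inode, set()).add(mirror)
            continue
        errs = ERRS_RE.search(line)
        if errs:
            device, corrupt = errs.group(1), int(errs.group(5))
            previous = max_corrupt_by_device.get(device, 0)
            max_corrupt_by_device[device] = max(previous, corrupt)
    return mirrors_by_inode, max_corrupt_by_device


if __name__ == "__main__":
    mirrors, corrupt = summarize_btrfs_errors(SAMPLE_LOG)
    for inode, failed in sorted(mirrors.items()):
        print(f"inode {inode}: checksum failed on mirror(s) {sorted(failed)}")
    for device, count in sorted(corrupt.items()):
        print(f"{device}: corrupt counter reached {count}")
```

An inode whose checksum fails on both mirrors, as inode 78451 does here, is one hint that the data may already have been bad when it was written. That would fit the advice to run memtest first, although the log alone does not prove it.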
VRx (Author) Posted June 12, 2023 6 hours ago, JorgeB said: "Btrfs is detecting data corruption on cache:" Oh damn, but Docker and QEMU don't touch the cache at all. Does this mean the btrfs errors are for the server?
JorgeB Posted June 12, 2023 Btrfs detecting data corruption is usually a sign of a more serious issue, such as bad RAM, and that can make basically everything crash.