October 13, 2025

Something happened last night that caused all my containers to fail. While investigating, I tried to shut down the array, but it got stuck on "stopping". The logs indicated that one of my cache disks couldn't be unmounted. During this process I read somewhere that the cause could be too little space allocated for Docker, so I increased it, although the system was already in a weird state and I probably shouldn't have done that. It didn't help, so I ended up shutting the system down and starting it again.

Once it booted, I started the array and then ran the CA Appdata Backup plugin, because I think that's what was running when it failed last night. It worked for a few containers, then started to fail and failed consistently from then on. The Docker service was then rendered kaput, in basically the same state as earlier. Stopping the array didn't work.

Oct 12 12:46:05 Unraid kernel: BTRFS error (device loop2 state A): Transaction aborted (error -5)
Oct 12 12:46:05 Unraid kernel: BTRFS: error (device loop2 state A) in do_free_extent_accounting:2983: errno=-5 IO failure
Oct 12 12:46:05 Unraid kernel: BTRFS info (device loop2 state EA): forced readonly
Oct 12 12:46:05 Unraid kernel: BTRFS error (device loop2 state EA): failed to run delayed ref for logical 1778130944 num_bytes 4096 type 178 action 2 ref_mod 1: -5
Oct 12 12:46:05 Unraid kernel: BTRFS: error (device loop2 state EA) in btrfs_run_delayed_refs:2215: errno=-5 IO failure
Oct 12 12:46:05 Unraid kernel: traps: postgres[406881] general protection fault ip:153cbbc32507 sp:7ffe29db9b50 error:0 in libc.so.6[28507,153cbbc32000+165000]
Oct 12 12:46:05 Unraid kernel: traps: postgres[439417] general protection fault ip:1543b77b6a1a sp:7ffee4672fa0 error:0 in ld-musl-x86_64.so.1[41a1a,1543b7789000+57000]

I think this is a clue, but I don't know what I should do from here.
Any help is appreciated!

[edit] This shows some unallocated space on my cache drive, but I don't know if it's related.

Edited October 13, 2025 by pr0x1ma: Add screenshot
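For readers following along: the `-5` in the BTRFS lines above is a standard Linux error number with the sign flipped. Below is a minimal Python sketch that translates such a return code into its symbolic name. The helper name `describe_kernel_errno` is made up for illustration, and the exact message text depends on the platform's C library.

```python
import errno
import os


def describe_kernel_errno(code: int) -> str:
    """Translate a negative kernel return code (e.g. -5) into 'NAME: message'."""
    number = abs(code)
    # errno.errorcode maps error numbers to symbolic names on this platform.
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name}: {os.strerror(number)}"


if __name__ == "__main__":
    # The value reported by the BTRFS messages in the log excerpt.
    print(describe_kernel_errno(-5))
```

On Linux, `-5` resolves to `EIO`, a generic I/O error, which fits a filesystem that could no longer read or write its backing device.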
October 13, 2025 (Community Expert)
Your docker.img is likely corrupted, not the cache pool. Please post diagnostics.

Edited October 13, 2025 by MowMdown
October 13, 2025 (Author)
Here's the diagnostics. One other thing: when I was on the command line while Docker wasn't working, I was getting an input/output error when running something as simple as "ls -la".

unraid-diagnostics-20251012-2042.zip
October 13, 2025 (Community Expert)
Your cache drive is having read/write errors and your docker.img is corrupted.
October 13, 2025 (Author)
> Your cache drive is having read/write errors
What do I do about this?
> and your docker.img is corrupted.
Is this fixable?
October 13, 2025 (Community Expert)
What's weird is that the most recent log isn't showing issues with the cache drive /dev/nvme0n1, but the previous logs do.

You probably want to run a SMART test on the cache drive. I'm not sure if the drive is failing or just needs to be reformatted. I would prepare for the worst and back up the data in case the drive is dying.

Recreating the docker.img is simple and can be done at any time: stop the Docker service, delete the docker.img file, and start the service back up. It will automatically generate a new docker.img. The exception is the Nextcloud AIO container: if you use it, you will lose the data in that specific container. Otherwise you will just need to reinstall the container images by going to Community Apps, clicking on "previous apps", and checking the box for each container you had installed; it will go through and install them for you and get you back up and running. You shouldn't need to reconfigure any containers, only any custom Docker networks you may have created.

Edited October 13, 2025 by MowMdown
October 13, 2025 (Author)
I have the original syslog as well, which shows the I/O errors starting right after midnight (see attached).

syslog.1.b.txt

This is what the extended SMART test came back with (spoiler: no errors):

Samsung_SSD_990_PRO_2TB_S7KHNU0X633796F-20251012-2120.txt

Is it possible that a corrupted docker.img could cause the I/O errors? I'm pretty sure that the "system-previous" log is the log from after I rebooted the first time (I rebooted twice) and then ran the appdata backup plugin, which produced the original error.

Edited October 13, 2025 by pr0x1ma
October 13, 2025 (Community Expert)
The NVMe device dropped offline before:

Oct 12 12:46:05 Unraid kernel: nvme nvme0: Device not ready; aborting reset, CSTS=0x1
Oct 12 12:46:05 Unraid kernel: nvme nvme0: Disabling device after reset failure: -19

You can try this: on the main GUI page click on the flash drive, scroll down to "Syslinux Configuration", make sure it's set to "menu view" (top right), and add this to your default boot option, after "append initrd=/bzroot":

nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pcie_port_pm=off

e.g.:

append initrd=/bzroot nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pcie_port_pm=off

Reboot, recreate the docker image and see if it makes a difference.
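The edit described above amounts to appending three parameters to the `append` line of the default boot entry. As a rough illustration only (the GUI editor is the route recommended in the post), here is a Python sketch that adds the parameters to an `append` line without duplicating any that are already present. The function name `add_kernel_params` is hypothetical, and the sketch works on a single line of text rather than on the real config file.

```python
# Parameters quoted in the post above.
RECOMMENDED = (
    "nvme_core.default_ps_max_latency_us=0",
    "pcie_aspm=off",
    "pcie_port_pm=off",
)


def add_kernel_params(append_line: str, params=RECOMMENDED) -> str:
    """Return the append line with any missing parameters added at the end."""
    # Keep the original indentation, since config lines are often indented.
    indent = append_line[: len(append_line) - len(append_line.lstrip())]
    tokens = append_line.split()
    if not tokens or tokens[0] != "append":
        raise ValueError("expected a line starting with 'append'")
    # Compare on the key before '=' so an existing setting is not duplicated.
    present = {tok.split("=", 1)[0] for tok in tokens[1:]}
    for param in params:
        if param.split("=", 1)[0] not in present:
            tokens.append(param)
    return indent + " ".join(tokens)


if __name__ == "__main__":
    print(add_kernel_params("append initrd=/bzroot"))
```

Running the function twice on the same line leaves it unchanged, so it is safe to reapply. Note that it does not overwrite a parameter that is already set to a different value; that case would need a manual check.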
October 15, 2025 (Author)
OK, so I set those settings and recreated the docker.img. Everything has been working just fine for the last day or so, but I don't know how to reproduce the issue, so I suppose I'll just have to wait and see if it happens again. Thanks for your help!
January 8 (Author)
Alright, so it's been a while. I ended up buying a different SSD. However, now the new SSD is also giving me BTRFS errors. Like last time, the errors appeared while performing a backup. Is this due to my RAM, @JorgeB?

Jan 8 04:04:31 Unraid Docker Auto Update: Community Applications Docker Autoupdate finished
Jan 8 04:14:18 Unraid emhttpd: read SMART /dev/sdi
Jan 8 04:25:54 Unraid emhttpd: read SMART /dev/sdb
Jan 8 04:27:22 Unraid kernel: nvme nvme0: I/O tag 112 (c070) opcode 0x2 (I/O Cmd) QID 1 timeout, aborting req_op:READ(0) size:16384
Jan 8 04:27:22 Unraid kernel: nvme nvme0: I/O tag 113 (a071) opcode 0x2 (I/O Cmd) QID 1 timeout, aborting req_op:READ(0) size:532480
Jan 8 04:27:22 Unraid kernel: nvme nvme0: I/O tag 114 (5072) opcode 0x2 (I/O Cmd) QID 1 timeout, aborting req_op:READ(0) size:524288
Jan 8 04:27:22 Unraid kernel: nvme nvme0: I/O tag 115 (e073) opcode 0x2 (I/O Cmd) QID 1 timeout, aborting req_op:READ(0) size:524288
Jan 8 04:27:22 Unraid kernel: nvme nvme0: I/O tag 116 (a074) opcode 0x2 (I/O Cmd) QID 1 timeout, aborting req_op:READ(0) size:524288
Jan 8 04:27:22 Unraid kernel: nvme nvme0: I/O tag 117 (6075) opcode 0x2 (I/O Cmd) QID 1 timeout, aborting req_op:READ(0) size:540672
Jan 8 04:27:22 Unraid kernel: nvme nvme0: I/O tag 75 (904b) opcode 0x0 (I/O Cmd) QID 4 timeout, aborting req_op:FLUSH(2) size:0
Jan 8 04:27:22 Unraid kernel: nvme nvme0: Abort status: 0x0
Jan 8 04:27:22 Unraid kernel: nvme nvme0: Abort status: 0x0
Jan 8 04:27:22 Unraid kernel: nvme nvme0: Abort status: 0x0
Jan 8 04:27:22 Unraid kernel: nvme nvme0: Abort status: 0x0
Jan 8 04:27:22 Unraid kernel: nvme0n1: I/O Cmd(0x2) @ LBA 3385720960, 1024 blocks, I/O Error (sct 0x0 / sc 0x7)
Jan 8 04:27:22 Unraid kernel: nvme nvme0: Abort status: 0x0
Jan 8 04:27:22 Unraid kernel: I/O error, dev nvme0n1, sector 3385720960 op 0x0:(READ) flags 0x84700 phys_seg 128 prio class 2
Jan 8 04:27:22 Unraid kernel: nvme0n1: I/O Cmd(0x2) @ LBA 3385721984, 1056 blocks, I/O Error (sct 0x0 / sc 0x7)
Jan 8 04:27:22 Unraid kernel: I/O error, dev nvme0n1, sector 3385721984 op 0x0:(READ) flags 0x80700 phys_seg 128 prio class 2
Jan 8 04:27:22 Unraid kernel: nvme nvme0: Abort status: 0x0
Jan 8 04:27:22 Unraid kernel: nvme nvme0: Abort status: 0x0
Jan 8 04:27:25 Unraid kernel: nvme nvme0: I/O tag 847 (734f) opcode 0x2 (I/O Cmd) QID 11 timeout, aborting req_op:READ(0) size:57344
Jan 8 04:27:26 Unraid kernel: nvme nvme0: I/O tag 998 (f3e6) opcode 0x2 (I/O Cmd) QID 5 timeout, aborting req_op:READ(0) size:16384
Jan 8 04:27:32 Unraid kernel: nvme nvme0: I/O tag 13 (a00d) opcode 0x2 (I/O Cmd) QID 9 timeout, aborting req_op:READ(0) size:16384
Jan 8 04:27:32 Unraid kernel: nvme nvme0: I/O tag 666 (629a) opcode 0x2 (I/O Cmd) QID 2 timeout, aborting req_op:READ(0) size:16384
Jan 8 04:27:38 Unraid kernel: nvme nvme0: I/O tag 364 (316c) opcode 0x9 (I/O Cmd) QID 3 timeout, aborting req_op:DISCARD(3) size:7573504
Jan 8 04:27:40 Unraid kernel: nvme nvme0: I/O tag 5 (7005) opcode 0x2 (I/O Cmd) QID 10 timeout, aborting req_op:READ(0) size:16384
Jan 8 04:27:52 Unraid kernel: nvme nvme0: I/O tag 112 (d070) opcode 0x2 (I/O Cmd) QID 1 timeout, aborting req_op:READ(0) size:16384
Jan 8 04:27:52 Unraid kernel: nvme nvme0: I/O tag 113 (a071) opcode 0x2 (I/O Cmd) QID 1 timeout, reset controller
Jan 8 04:29:16 Unraid kernel: nvme nvme0: Device not ready; aborting reset, CSTS=0x1
Jan 8 04:29:16 Unraid kernel: nvme0n1: I/O Cmd(0x2) @ LBA 3385717872, 1040 blocks, I/O Error (sct 0x3 / sc 0x71)
Jan 8 04:29:16 Unraid kernel: I/O error, dev nvme0n1, sector 3385717872 op 0x0:(READ) flags 0x80700 phys_seg 128 prio class 2
Jan 8 04:29:16 Unraid kernel: nvme0n1: I/O Cmd(0x2) @ LBA 3385718912, 1024 blocks, I/O Error (sct 0x3 / sc 0x71)
Jan 8 04:29:16 Unraid kernel: I/O error, dev nvme0n1, sector 3385718912 op 0x0:(READ) flags 0x84700 phys_seg 128 prio class 2
Jan 8 04:29:16 Unraid kernel: nvme0n1: I/O Cmd(0x2) @ LBA 3385719936, 1024 blocks, I/O Error (sct 0x3 / sc 0x71)
Jan 8 04:29:16 Unraid kernel: I/O error, dev nvme0n1, sector 3385719936 op 0x0:(READ) flags 0x80700 phys_seg 128 prio class 2
Jan 8 04:29:16 Unraid kernel: nvme0n1: I/O Cmd(0x2) @ LBA 802484856, 32 blocks, I/O Error (sct 0x3 / sc 0x71)
Jan 8 04:29:16 Unraid kernel: I/O error, dev nvme0n1, sector 802484856 op 0x0:(READ) flags 0x80700 phys_seg 4 prio class 2
Jan 8 04:29:16 Unraid kernel: nvme0n1: I/O Cmd(0x2) @ LBA 786791032, 112 blocks, I/O Error (sct 0x3 / sc 0x71)
Jan 8 04:29:16 Unraid kernel: I/O error, dev nvme0n1, sector 786791032 op 0x0:(READ) flags 0x80700 phys_seg 14 prio class 2
Jan 8 04:29:16 Unraid kernel: nvme0n1: I/O Cmd(0x2) @ LBA 432564896, 64 blocks, I/O Error (sct 0x3 / sc 0x71)
Jan 8 04:29:16 Unraid kernel: I/O error, dev nvme0n1, sector 432564896 op 0x0:(READ) flags 0x80700 phys_seg 8 prio class 2
Jan 8 04:29:16 Unraid kernel: nvme nvme0: Abort status: 0x371
Jan 8 04:29:16 Unraid kernel: nvme nvme0: Abort status: 0x371
Jan 8 04:29:16 Unraid kernel: nvme nvme0: Abort status: 0x371
Jan 8 04:29:16 Unraid kernel: nvme nvme0: Abort status: 0x371
Jan 8 04:29:16 Unraid kernel: nvme nvme0: Abort status: 0x371
Jan 8 04:29:16 Unraid kernel: nvme nvme0: Abort status: 0x371
Jan 8 04:29:16 Unraid kernel: nvme nvme0: Abort status: 0x371
Jan 8 04:29:36 Unraid kernel: nvme nvme0: Device not ready; aborting reset, CSTS=0x1
Jan 8 04:29:36 Unraid kernel: nvme nvme0: Disabling device after reset failure: -19
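A note on reading this log: the `(sct ... / sc ...)` pairs and the `Abort status` values appear to be the same NVMe status field in two notations, combined as `(sct << 8) | sc`, so `sct 0x3 / sc 0x71` corresponds to `0x371`. The Python sketch below tallies the status codes in a set of log lines. The symbolic names are as I recall them from the Linux kernel's `include/linux/nvme.h`, so treat them as an assumption and verify against your kernel source; `summarize_status` is a made-up helper name.

```python
import re
from collections import Counter

# Names as recalled from the Linux kernel's include/linux/nvme.h.
# Assumption: verify against the kernel version you are running.
STATUS_NAMES = {
    0x007: "NVME_SC_ABORT_REQ",
    0x370: "NVME_SC_HOST_PATH_ERROR",
    0x371: "NVME_SC_HOST_ABORTED_CMD",
}

# Matches the "(sct 0x3 / sc 0x71)" fragment in the kernel's I/O error lines.
SCT_SC = re.compile(r"\(sct (0x[0-9a-fA-F]+) / sc (0x[0-9a-fA-F]+)\)")


def combined_status(sct: int, sc: int) -> int:
    """Combine status code type and status code into one value."""
    return (sct << 8) | sc


def summarize_status(lines) -> Counter:
    """Count how often each combined status appears in the given log lines."""
    counts = Counter()
    for line in lines:
        match = SCT_SC.search(line)
        if match:
            sct = int(match.group(1), 16)
            sc = int(match.group(2), 16)
            counts[combined_status(sct, sc)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "nvme0n1: I/O Cmd(0x2) @ LBA 3385720960, 1024 blocks, I/O Error (sct 0x0 / sc 0x7)",
        "nvme0n1: I/O Cmd(0x2) @ LBA 3385717872, 1040 blocks, I/O Error (sct 0x3 / sc 0x71)",
    ]
    for status, count in sorted(summarize_status(sample).items()):
        print(f"{status:#05x} {STATUS_NAMES.get(status, 'unknown')} x{count}")
```

If the names above are right, the first batch of errors are commands aborted after a timeout, and the second batch are commands the host cancelled once the controller stopped responding. Both point at the device dropping off the bus, not at bad data on the flash itself.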
January 8 (Community Expert)
This is typically not RAM-related; most often it's a compatibility issue between the board, the NVMe device, and the kernel. This can sometimes help:

On the main GUI page click on the flash drive, scroll down to "Syslinux Configuration", make sure it's set to "menu view" (top right), and add this to your default boot option, after "append initrd=/bzroot":

nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pcie_port_pm=off

e.g.:

append initrd=/bzroot nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pcie_port_pm=off

Reboot and see if it makes a difference.
January 8 (Author)
This is what I had in the configuration when it failed:

append initrd=/bzroot nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pcie_port_pm=off acpi_enforce_resources=lax
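A quick way to confirm that an `append` line already carries the three suggested parameters with the suggested values is to parse it into key/value pairs. The Python sketch below does that; the names `parse_append` and `missing_or_different` are hypothetical, and it checks only the text of the line, not what the running kernel actually received.

```python
# The three parameters suggested earlier in the thread, with their values.
SUGGESTED = {
    "nvme_core.default_ps_max_latency_us": "0",
    "pcie_aspm": "off",
    "pcie_port_pm": "off",
}


def parse_append(line: str) -> dict:
    """Split an 'append ...' line into a {parameter: value} dictionary."""
    tokens = line.split()
    if not tokens or tokens[0] != "append":
        raise ValueError("expected a line starting with 'append'")
    params = {}
    for token in tokens[1:]:
        # Flags without '=' end up with an empty string as their value.
        key, _, value = token.partition("=")
        params[key] = value
    return params


def missing_or_different(line: str) -> list:
    """List suggested settings that are absent or set to another value."""
    params = parse_append(line)
    return [f"{key}={value}" for key, value in SUGGESTED.items()
            if params.get(key) != value]


if __name__ == "__main__":
    line = ("append initrd=/bzroot nvme_core.default_ps_max_latency_us=0 "
            "pcie_aspm=off pcie_port_pm=off acpi_enforce_resources=lax")
    print(missing_or_different(line))
```

For the line quoted in the post, the list comes back empty, which matches the author's point that the parameters were already in place when the failure happened.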
January 8 (Community Expert)
In that case, I recommend looking for a BIOS update or trying a different brand/model of device (or board).