DogeKitteh007 Posted July 26, 2022 (edited July 27, 2022): Hello, I haven't rebooted the server yet, but it seems to be stuck on "Array Stopping•Retry unmounting disk share(s)..." when I try to stop the array. Could I have a dead NVMe drive on my hands? It seems like Unraid is having a hard time getting it to respond. Attached are the syslog and docker log files: syslog.txt, docker.txt
trurl Posted July 26, 2022: Attach diagnostics to your NEXT post in this thread.
DogeKitteh007 Posted July 26, 2022 Author Share Posted July 26, 2022 Just now, trurl said: attach diagnostics to your NEXT post in this thread unraid-diagnostics-20220726-2216-anon.zip Quote Link to comment
DogeKitteh007 Posted July 26, 2022 Author Share Posted July 26, 2022 Some extra info: I noticed it because suddenly none of my dockers were responding, and in the logs it said there was corruption in the image. Sorry about not posting the complete diagnostics btw. Quote Link to comment
Solution: trurl Posted July 26, 2022: Both docker.img and cache are read-only due to corruption. You can recreate docker.img after you fix cache. You may have to copy what you can from cache and reformat it. Have you done a memtest recently?
DogeKitteh007 Posted July 26, 2022 Author Share Posted July 26, 2022 2 minutes ago, trurl said: Both docker.img and cache are readonly due to corruption. You can recreate docker.img after you fix cache. You may have to copy what you can from cache and reformat it. Have you done memtest recently? No, I'm afraid not. I was just about to RMA the drive, but I'll see about forcing a reboot and doing that memtest instead. Don't have ECC-memory on the server unfortunately. Quote Link to comment
DogeKitteh007 Posted July 26, 2022 Author Share Posted July 26, 2022 Had to download the latest image from memtest86+ and make a USB. I'll leave it on and check on it later tonight. For some reason, every time I selected the MemTest option in the unRaid menu, it just rebooted and I was right back where I started. Thank you for taking the time, @trurl. 👍 Quote Link to comment
DogeKitteh007 Posted July 26, 2022 Author Share Posted July 26, 2022 Update: I've done two full rounds of memtest. Result: PASS Run all basic tests on the nvme-drive in question with Seatools X. Result: PASS What could be the reason behind this corruption and do you have any advice on how to proceed from here? Quote Link to comment
JorgeB Posted July 27, 2022: Jul 26 12:51:19 unraid kernel: BTRFS error (device nvme0n1p1): block=230151340032 write time tree block corruption detected. Btrfs went read-only because it detected corruption before writing the data to the device. This is usually caused by bad RAM or something else corrupting kernel memory. P.S. I also saw some macvlan call traces; switching to ipvlan should fix those (Settings -> Docker Settings -> Docker custom network type -> ipvlan; advanced view must be enabled, top right).
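For anyone who wants to check their own syslog for the same kind of message, here is a minimal Python sketch. It assumes the lines follow the same layout as the single error line quoted above; the regular expression, the function name `find_btrfs_errors`, and the second sample line are illustrative assumptions, not part of any Unraid tooling.

```python
import re

# Pattern inferred from the one kernel line quoted in this thread;
# other BTRFS messages may be laid out differently (assumption).
BTRFS_ERROR = re.compile(
    r"kernel: BTRFS error \(device (?P<device>[^)]+)\): (?P<message>.*)"
)


def find_btrfs_errors(lines):
    """Return (device, message) tuples for every BTRFS error line found."""
    hits = []
    for line in lines:
        match = BTRFS_ERROR.search(line)
        if match:
            hits.append((match.group("device"), match.group("message").strip()))
    return hits


# The first line is the one from this thread; the second is a made-up
# non-matching line included only to show that it is skipped.
sample = [
    "Jul 26 12:51:19 unraid kernel: BTRFS error (device nvme0n1p1): "
    "block=230151340032 write time tree block corruption detected",
    "Jul 26 12:51:20 unraid kernel: some unrelated message",
]

for device, message in find_btrfs_errors(sample):
    print(device, "->", message)
```

To scan a saved log instead of the sample, the same function can be fed the lines of a file such as the syslog.txt attached earlier, for example `find_btrfs_errors(open("syslog.txt", errors="replace"))`.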
DogeKitteh007 Posted July 27, 2022 Author Share Posted July 27, 2022 5 hours ago, JorgeB said: Jul 26 12:51:19 unraid kernel: BTRFS error (device nvme0n1p1): block=230151340032 write time tree block corruption detected Btrfs went read only because it detected corruption before writing the data to the device, this is usually bad RAM or something else causing kernel memory corruption. P.S. also saw some macvlan call traces, switching to ipvlan should fix it (Settings -> Docker Settings -> Docker custom network type -> ipvlan (advanced view must be enable, top right)) 16 hours ago, trurl said: Both docker.img and cache are readonly due to corruption. You can recreate docker.img after you fix cache. You may have to copy what you can from cache and reformat it. Have you done memtest recently? Wonder what could be causing this. The RAM checked out fine and everything is running stock speeds. Cache disk also seemed to be ok. I like it better if a specific component just dies. At least then I know what the cause is! 😅 So, in conclusion: - Copy what I can from cache drive if it is accessible after startup (appdata is most important here) - Recreate docker.img on a separate ssd this time - Reformat cache drive(?) and try using the same drive since the hw-tests came out OK - Switch to ipvlan (thanks @JorgeB) Quote Link to comment
JorgeB Posted July 27, 2022: Cache should mount normally after a reboot, so back up anything you need. I'm not sure reformatting will help with this, but it won't hurt.
DogeKitteh007 Posted July 27, 2022 Author Share Posted July 27, 2022 9 hours ago, JorgeB said: Cache should mount normally after a reboot, backup anything you need, not sure re-formatting will help for this, but it won't hurt. I ended up just deleting the docker.img and ran a scrub on the cache drive afterwards. It didn't find any errors. After that I let the CA Backup / Restore Appdata plugin do it's thing and restarted docker. Seems to be working fine again now. 1 Quote Link to comment