Baskedk Posted October 13, 2020

Hey everyone. Last night my unRAID server went crazy on me: it started throwing read errors and messing with my Dockers and VMs. I have no idea where to start figuring out what caused this. And my cache is apparently full of BTRFS errors:

Oct 13 07:39:02 Jinx kernel: print_req_error: I/O error, dev sdb, sector 5663456
Oct 13 07:39:02 Jinx kernel: BTRFS error (device sdb1): bdev /dev/sdb1 errs: wr 350, rd 905675, flush 0, corrupt 0, gen 0

If anyone has the knowledge and time to look through my diagnostics and figure out the root cause, it would be greatly appreciated. Here are a few visual representations of the problem from the GUI:

[Screenshots: Array, Docker, VMs]

I had 5-6 VMs yesterday....

I kind of regret that I switched to Ryzen. I have had so many problems with the system since I switched over. Does anyone know if this could be related? Are there compatibility problems on the AMD side?

Thanks in advance, lovely community.

Best Regards
Baskedk

jinx-diagnostics-20201013-0728.zip
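For anyone wanting to track those counters over time, here is a minimal Python sketch that pulls the per-device numbers out of a BTRFS error line like the one quoted above. The function name `parse_btrfs_errs` and the regular expression are illustrative assumptions based only on that single log line, not part of unRAID or any BTRFS tooling.

```python
import re

# Pattern modelled on the one kernel line quoted in the post, e.g.
# "bdev /dev/sdb1 errs: wr 350, rd 905675, flush 0, corrupt 0, gen 0".
BTRFS_ERRS = re.compile(
    r"bdev (?P<dev>\S+) errs: wr (?P<wr>\d+), rd (?P<rd>\d+), "
    r"flush (?P<flush>\d+), corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def parse_btrfs_errs(line):
    """Return (device, counters) for a BTRFS error-summary line, else None."""
    match = BTRFS_ERRS.search(line)
    if match is None:
        return None
    counters = {
        name: int(value)
        for name, value in match.groupdict().items()
        if name != "dev"
    }
    return match.group("dev"), counters


# The log line from the post, used here as sample input.
sample = (
    "Oct 13 07:39:02 Jinx kernel: BTRFS error (device sdb1): "
    "bdev /dev/sdb1 errs: wr 350, rd 905675, flush 0, corrupt 0, gen 0"
)

if __name__ == "__main__":
    print(parse_btrfs_errs(sample))
```

Comparing the `rd` and `wr` values between two log lines shows whether the errors are still increasing or stopped after a reboot.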
Baskedk Posted October 13, 2020 (Author)

A reboot of the server got everything working again. At least it looks like it for now. I'm still very curious about what could cause this. If it can happen out of the blue like that, it can happen again...... And that kind of stability is not something I'm a big fan of.
ChatNoir Posted October 13, 2020

Your BTRFS errors are a different problem from the array errors you are highlighting. Your BTRFS errors seem to be on sdb and/or sdc (your cache drives), and your array is formatted in XFS.

I cannot see SMART data in the diagnostics for any drive I checked. The reports only contain:

A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

Can you run SMART manually on the drives with errors? I'd say sdd & sde, plus maybe sdb & sdc to be sure.

But there sure are a lot of errors, and not just the one you quoted. I cannot help you much on this; let's see what the others can propose.
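As a rough sketch of what "run SMART manually" could look like, the Python snippet below only builds the `smartctl` command lines, it does not execute them. `-a` and `-T permissive` are real smartctl options (the latter is the one named in the error message above); the helper name `smartctl_command` and the `/dev/sdX` device paths are assumptions derived from the short drive names in the post.

```python
import shlex


def smartctl_command(device, permissive=False):
    """Build a smartctl invocation that prints all SMART info for a device."""
    cmd = ["smartctl", "-a"]
    if permissive:
        # Ask smartctl to continue even if a mandatory SMART command fails.
        cmd += ["-T", "permissive"]
    cmd.append(device)
    return cmd


# Device paths assumed from the drive names mentioned in the post.
DEVICES = ("/dev/sdd", "/dev/sde", "/dev/sdb", "/dev/sdc")

if __name__ == "__main__":
    for dev in DEVICES:
        # Print the command so it can be reviewed before running it as root.
        print(shlex.join(smartctl_command(dev, permissive=True)))
```

The printed lines can be pasted into a terminal on the server, or passed to `subprocess.run` if you prefer to collect the output from a script.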
JorgeB Posted October 13, 2020

Problem with the onboard SATA controller:

Oct 12 23:36:45 Jinx kernel: ahci 0000:01:00.1: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000da8d0000 flags=0x0000]

Quite common with Ryzen boards. There are reports that updating to the latest beta helps, due to the newer kernel; you can also disable IOMMU if it's not needed.
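To check whether a syslog still contains these events after a change, something like the following Python sketch could be used. It counts `IO_PAGE_FAULT` lines per driver and PCI address; the regular expression and the name `count_page_faults` are assumptions modelled on the single log line above, so other kernels may format the message differently.

```python
import re
from collections import Counter

# Pattern modelled on the one line quoted above:
# "kernel: ahci 0000:01:00.1: Event logged [IO_PAGE_FAULT ..."
PAGE_FAULT = re.compile(
    r"kernel: (?P<driver>\S+) "
    r"(?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"Event logged \[IO_PAGE_FAULT"
)


def count_page_faults(lines):
    """Count IO_PAGE_FAULT events per (driver, PCI address) pair."""
    counts = Counter()
    for line in lines:
        match = PAGE_FAULT.search(line)
        if match:
            counts[(match.group("driver"), match.group("pci"))] += 1
    return counts


# Sample input: the two kernel lines quoted in this thread.
sample_log = [
    "Oct 12 23:36:45 Jinx kernel: ahci 0000:01:00.1: Event logged "
    "[IO_PAGE_FAULT domain=0x0000 address=0x00000000da8d0000 flags=0x0000]",
    "Oct 13 07:39:02 Jinx kernel: print_req_error: I/O error, dev sdb, sector 5663456",
]

if __name__ == "__main__":
    for (driver, pci), n in count_page_faults(sample_log).items():
        print(f"{driver} {pci}: {n} IO_PAGE_FAULT event(s)")
```

An empty result after updating the OS or disabling IOMMU would suggest the controller faults have stopped, at least for the period the log covers.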
tjb_altf4 Posted October 13, 2020

1 hour ago, JorgeB said:
Quite common with Ryzen boards, there are reports that updating to the latest beta helps, due to newer kernel, you can also disable IOMMU if not needed.

Video on that issue, with some possible fixes linked in their forum thread (in description)
Baskedk Posted October 13, 2020 (Author)

I don't even know what IOMMU is, so I have no idea if I need it, hehe.

I can't seem to find a newer version of my BIOS, and no beta versions at all. It's an 'Asus Prime B450M-A'.

But if all this nonsense is SATA controller related, can anyone recommend a good, reliable AM4 motherboard to use for unRAID? My life is too short for this kind of random data failure all the time. And if this can be solved with another mobo, that's my go-to.
Baskedk Posted October 13, 2020 (Author)

Just now, tjb_altf4 said:
Video on that issue, with some possible fixes linked in their forum thread (in description)

thx, I'll give it a watch 👍
Baskedk Posted October 13, 2020 (Author)

1 hour ago, JorgeB said:
Problem with the onboard SATA controller:
Oct 12 23:36:45 Jinx kernel: ahci 0000:01:00.1: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000da8d0000 flags=0x0000]
Quite common with Ryzen boards, there are reports that updating to the latest beta helps, due to newer kernel, you can also disable IOMMU if not needed.

Ahh, you meant the unRAID OS beta, and not a beta BIOS 🙄 I'll try that out and see if the kernel makes the difference 👍
Baskedk Posted October 20, 2020 (Author)

Just to follow up on my issue: it seems that the beta update stopped my disk errors. My log is not flooded with errors anymore, and everything seems to be in order. Glad I didn't move to Ryzen earlier, when that option was not around 👍

Thanks @JorgeB for pointing me in the right direction 😎