Brandonb1987, posted April 12, 2019 (edited April 12, 2019 by Brandon87)

I've had quite a few issues with my Unraid server over the last few weeks. First, drive 1 pooped itself and became read-only. I managed to get everything off it and onto other disks before removing the drive and resetting the pool without it. After that, I ran a scrub on all the disks (15/16 disks now) and found 5 files that were not correctable. I have since replaced or removed those files in favour of good copies, so hopefully that is fixed.

Since I've had so many problems, I felt it was prudent to run a filesystem check on all the drives today, and now I'm getting a few errors on 3 of the 15 disks. What would be the best way to fix this? Should I do the same as I did with drive 1 (pull the data off onto the good disks and reformat the disk)? Or can I simply pull the disk, format it, and reinstall and let it rebuild from parity?

Part 2 of this issue: after reading through the log, it appears this is isolated to my first 8 drives, which are all on one controller. Could the HBA be the cause of all these issues lately? Would it be a good idea to get it replaced?
JonathanM, posted April 12, 2019

30 minutes ago, Brandon87 said:
> Or can I simply pull the disk, format it, and reinstall and let it rebuild from parity?

Parity doesn't contain any files. It contains an image of the drive, filesystem included. Rebuilding will recreate the file corruption exactly as it is right now. If you format the emulated drive, parity will be updated to reflect that, and rebuilding will put that empty filesystem back on the drive.

If you want specific analysis based on your hardware and current situation, you will need to attach the intact diagnostics zip file to your next post.
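To illustrate the point about parity, here is a toy sketch rather than Unraid's actual code. It assumes a single parity drive computed as a byte-wise XOR across the data drives, and it uses short byte strings as stand-ins for whole drives. The names `xor_parity`, `rebuild`, and the sample "disks" are hypothetical.

```python
from functools import reduce


def xor_parity(disks):
    """Byte-wise XOR across equally sized disk images."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*disks))


def rebuild(surviving_disks, parity):
    """Reconstruct the one missing disk from the survivors plus parity."""
    return xor_parity(list(surviving_disks) + [parity])


# Hypothetical 8-byte "disks"; disk1 already contains a corrupted byte (\x00).
disk0 = b"GOODDATA"
disk1 = b"BAD\x00FILE"
disk2 = b"MOREDATA"

# Parity is computed over raw bytes, so it faithfully encodes the corruption.
parity = xor_parity([disk0, disk1, disk2])

# Pulling disk1 and rebuilding returns the same bytes, corruption included.
rebuilt = rebuild([disk0, disk2], parity)
assert rebuilt == disk1

# Formatting the emulated disk updates parity; a rebuild then yields the
# empty image, not the old files.
formatted = bytes(8)
parity_after_format = xor_parity([disk0, formatted, disk2])
assert rebuild([disk0, disk2], parity_after_format) == formatted
```

Parity only knows about bits, not files, so it cannot tell a damaged block from a healthy one.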
Brandonb1987, posted April 12, 2019

Attachment: zeus-diagnostics-20190412-2347.zip
JorgeB, posted April 13, 2019

checksum error at logical 801567584256 on dev /dev/mapper/md8

These are checksum errors, i.e., data corruption. The SAS2LP is known to corrupt data in some cases (causing persistent sync errors, usually 5). Other than that, bad RAM is also a likely culprit. Either way, you have an underlying hardware problem. I would suggest running memtest and, in any case, replacing the controllers with LSI HBAs, since the SAS2LP controllers are not recommended.
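To see whether the errors cluster on particular devices, the checksum messages can be tallied per device. This is a minimal sketch that assumes the log lines contain the same fragment as the one quoted above; real syslog lines carry extra fields such as timestamps, which the pattern ignores. The function name `count_checksum_errors` and all sample lines except the first are hypothetical.

```python
import re
from collections import Counter

# Matches only the fragment quoted in the thread; surrounding fields are ignored.
CHECKSUM_RE = re.compile(r"checksum error at logical (\d+) on dev ([^\s,]+)")


def count_checksum_errors(lines):
    """Return a Counter mapping device path to number of checksum errors."""
    counts = Counter()
    for line in lines:
        match = CHECKSUM_RE.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


sample_log = [
    # Taken from the thread:
    "checksum error at logical 801567584256 on dev /dev/mapper/md8",
    # Hypothetical lines, for illustration only:
    "checksum error at logical 123456 on dev /dev/mapper/md8",
    "checksum error at logical 42 on dev /dev/mapper/md3, physical 7",
    "some unrelated log line",
]

for device, errors in count_checksum_errors(sample_log).most_common():
    print(device, errors)
```

If every affected device sits behind the same controller, that points at the controller (or something shared, such as RAM) rather than at individual disks.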
Brandonb1987, posted April 13, 2019

I’m thinking it’s the Marvell-based HBA, as everything is happening to the first 8 drives only. That HBA is likely the culprit, as you say. I’m going to order LSI cards to replace them both and run memtest as a precaution. Hopefully that will fix it.
Brandonb1987, posted April 14, 2019

I’ve tried to run memtest, but every time I select it, the whole system restarts and just goes back to the boot screen. Ideas?
JonathanM, posted April 14, 2019

16 minutes ago, Brandon87 said:
> I’ve tried to do the memtest but it seems every time I try to select it, the whole system restarts and just goes back to the boot screen. Ideas?

If the built-in memtest won't run, I'd advise trying to get another version of memtest running. The version in Unraid is an older open-source release; a newer program is available, but Unraid can't package it with the OS. You'll need to prepare a different USB stick.

https://www.memtest86.com/
Squid, posted April 14, 2019

1 hour ago, Brandon87 said:
> I’ve tried to do the memtest but it seems every time I try to select it, the whole system restarts and just goes back to the boot screen. Ideas?

You may have to temporarily switch your BIOS to boot from the flash drive in Legacy (non-EFI) mode.