shaunmccloud, posted June 6, 2023:
I've had unraid running for a couple of years now, never had a single issue when running a parity check. The latest parity check showed 900+ errors. Ran a correcting parity check, then another non-correcting one and the same amount of errors, even though the correcting one fixed said errors. Do I have a parity drive dying? Diagnostics is attached.
Attachment: bb-8-diagnostics-20230606-0905.zip
JorgeB, posted June 7, 2023:
12 hours ago, shaunmccloud said: "another non-correcting one and the same amount of errors"
It may be the same amount, but they are in different sectors. There are also too many sequential ones for this to be a transient RAM issue. The most likely culprits would be a controller or a disk. If it's a disk, it can be a pain to find, because you'd basically need to re-test after removing/replacing one disk at a time. If you have a different controller you could use, I would start there.
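The point above, that two checks can report the same error count in different sectors and that many of the errors are sequential, can be checked with a short script. This is only a minimal sketch under stated assumptions: it assumes the parity check lines in the syslog contain the text `sector=<number>`, and that consecutive reported sectors are 8 apart. Both are assumptions to verify against your own log; the function names are made up for illustration and are not part of Unraid.

```python
import re

# Assumption: each parity error line in the syslog contains "sector=<number>".
SECTOR_RE = re.compile(r"sector=(\d+)")


def extract_sectors(log_text):
    """Return the sorted, de-duplicated sector numbers found in the log text."""
    return sorted({int(m.group(1)) for m in SECTOR_RE.finditer(log_text)})


def sequential_runs(sectors, step=8):
    """Group sorted sectors into runs whose neighbours differ by exactly `step`.

    step=8 is an assumption about how far apart consecutive reported
    sectors are; change it if your log uses a different spacing.
    """
    runs = []
    for sector in sectors:
        if runs and sector - runs[-1][-1] == step:
            runs[-1].append(sector)
        else:
            runs.append([sector])
    return runs


def compare_checks(first_log, second_log):
    """Split the error sectors of two checks into shared and unique sets."""
    first = set(extract_sectors(first_log))
    second = set(extract_sectors(second_log))
    return {
        "both": sorted(first & second),
        "only_first": sorted(first - second),
        "only_second": sorted(second - first),
    }
```

Feed it the syslog text covering each check (for example, read from the syslog file inside each diagnostics zip). If `both` is nearly empty while the counts match, the errors moved between runs, which points away from a fixed set of bad sectors. Long runs from `sequential_runs` are the "sequential" pattern mentioned above. Note that this only compares sector numbers; it cannot tell you which disk is at fault.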
shaunmccloud, posted June 7, 2023:
7 hours ago, JorgeB said: "It may be the same amount, but they are in different sectors. There are also too many sequential ones for this to be a transient RAM issue. The most likely culprits would be a controller or a disk. If it's a disk, it can be a pain to find, because you'd basically need to re-test after removing/replacing one disk at a time. If you have a different controller you could use, I would start there."
I can replace one of my controllers, but not both of them. I am out of PCIe ports to swap out two controllers.
JorgeB, posted June 7, 2023:
Try just one; if the result is the same, swap the other. Note that you must run at least two checks to confirm whether the issue is still there: the 1st might still find errors, but the 2nd one must not.
shaunmccloud, posted June 7, 2023:
19 minutes ago, JorgeB said: "Try just one; if the result is the same, swap the other. Note that you must run at least two checks to confirm whether the issue is still there: the 1st might still find errors, but the 2nd one must not."
Should have said, one of my SAS controllers is onboard. So to be able to replace both of them I would have to get a 16 port controller. But I will replace the one I can with my cross-flashed H330 (IT mode) as a start. If I have to, I will pick up a new 16 port HBA and new cables (since most 16 port HBAs I can find on eBay use the newer connector type).
Edited June 7, 2023 by shaunmccloud
shaunmccloud, posted June 11, 2023:
@JorgeB Since I am not good at reading the logs, which drives had the issues? I have 4 on one controller and 8 on another one right now. Knowing which drives would help me narrow down which controller.
JorgeB, posted June 11, 2023:
9 hours ago, shaunmccloud said: "Since I am not good at reading the logs, which drives had the issues?"
That's unfortunately not possible to know, which is why I mentioned that if it's a disk it will basically be a pain, because you'd need to remove one disk at a time and re-test until you find the culprit, like this guy had to do a while back:
shaunmccloud, posted June 12, 2023:
Swapped my add-on HBA and ran a correcting parity check. The email I got showed 490 errors, but the UI showed 0. And looking at the log I do not see any issues like I did before. New diagnostics attached.
Attachment: bb-8-diagnostics-20230612-0702.zip
JorgeB, posted June 12, 2023:
11 minutes ago, shaunmccloud said: "The email I got showed 490 errors"
There's a known issue with the notifications, though it's not clear what causes it since it's not reproducible. The log does show 0 errors; if the next check is also 0 I would consider the issue solved.
shaunmccloud, posted June 12, 2023:
Ok, I do have a non-correcting check running right now. With the old controller, the parity check error count matched between the email & the UI.
JorgeB, posted June 12, 2023:
3 minutes ago, shaunmccloud said: "With the old controller, the parity check error count matched between the email & the UI."
Pretty sure it's not the controller; that issue only happens sometimes.
shaunmccloud, posted June 12, 2023:
1 minute ago, JorgeB said: "Pretty sure it's not the controller; that issue only happens sometimes."
I meant that I think the old controller was bad, since the two counts matched. But with the new controller they do not match.
JorgeB, posted June 12, 2023:
Agreed, if both checks complete without errors it could have been a bad controller.
shaunmccloud, posted June 12, 2023:
And it might actually be one of my 4TB disks.
Event: Unraid Disk 8 error
Subject: Alert [BB-8] - Disk 8 in error state (disk dsbl)
Description: ST4000VN000-1H4168_Z301MZHT (sdh)
Importance: alert
JorgeB, posted June 13, 2023:
That should not have caused sync errors, especially before it got disabled, but it's a possibility.
shaunmccloud, posted June 13, 2023:
4 hours ago, JorgeB said: "That should not have caused sync errors, especially before it got disabled, but it's a possibility."
It did rebuild fine; running another parity check now. I should consider replacing my 4TB drives "soon" anyway.