technorati Posted May 9, 2020
I have pulled two drives from a system that had some bad/cheap SATA cables in it, and added them to my unRAID. When I pulled the drives, the SATA cables were obviously coming apart where they mount to the drive, so I was not surprised to find that unRAID alerted me these drives had positive values in "UDMA CRC error count" (12 on one drive, 4 on the other). As I'm pretty confident this was due to the bad linkage they previously had, I'd like to just acknowledge this error, unless those values start increasing all of a sudden. But it seems every time I stop / start the array, I have to re-acknowledge those errors. Is there any way to permanently let the system know "Yep - 12 and 4 are acceptable values for those two drives"?
remotevisitor Posted May 9, 2020 (edited May 9, 2020 by remotevisitor)
Click on the SMART status entry on the dashboard for the disk in question and select the Acknowledge option. CRC entries are never actually reset back to zero, but if the value changes again you will be warned. If you are already doing this, then you may have some issue saving the acknowledged state back to the flash drive.
technorati (Author) Posted May 9, 2020
Yes, that's what I'm doing, but it seems to come back and alert me again every time I stop the array or reboot the machine.
trurl Posted May 9, 2020
14 minutes ago, technorati said: that's what I'm doing
21 minutes ago, remotevisitor said: then you may have some issue saving the saved state back to the flash drive.
Go to Tools - Diagnostics and attach the complete Diagnostics ZIP file to your NEXT post.
technorati (Author) Posted May 9, 2020
Other settings seem to stick OK, but here's my Diagnostics.
juggernaut-diagnostics-20200509-1111.zip
trurl Posted May 9, 2020
Syslogs are spammed with:
May 8 12:43:29 juggernaut root: error: /plugins/unassigned.devices/UnassignedDevices.php: wrong csrf_token
technorati (Author) Posted May 10, 2020
Based on that FAQ, I believe this is because there was another computer on the network that still had the unRAID web UI open across multiple reboots. It wasn't clear to me, though, whether you're saying this is the reason that it's not saving the SMART acknowledgements. Thanks for looking into this!
trurl Posted May 10, 2020
35 minutes ago, technorati said: It wasn't clear to me, though, whether you're saying this is the reason that it's not saving the SMART acks?
So have you tried it again after fixing that problem?
technorati (Author) Posted May 11, 2020 (edited May 11, 2020 by technorati: add more detail)
On 5/9/2020 at 7:23 PM, trurl said: So have you tried it again after fixing that problem?
Yes, I am still seeing the SMART status come back to error again intermittently: (I just confirmed there have been no csrf_token errors in my logs since my last reboot)
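The check described here (confirming that the "wrong csrf_token" message no longer appears in the logs) could be done by counting matching lines in a saved syslog. This is a minimal Python sketch, not a built-in unRAID tool: the function name `count_csrf_errors` is hypothetical, the first sample line is the one quoted earlier in the thread, and the second sample line is made up for illustration.

```python
from typing import Iterable

# Message text quoted earlier in this thread.
NEEDLE = "wrong csrf_token"


def count_csrf_errors(lines: Iterable[str], needle: str = NEEDLE) -> int:
    """Return how many log lines contain the needle text."""
    return sum(1 for line in lines if needle in line)


sample = [
    "May 8 12:43:29 juggernaut root: error: "
    "/plugins/unassigned.devices/UnassignedDevices.php: wrong csrf_token",
    # Made-up unrelated line, included only to show a non-match.
    "May 8 12:43:30 juggernaut example: unrelated message",
]
print(count_csrf_errors(sample))
```

To run it against a real log, pass an open file object instead of the sample list; the location of the syslog depends on the system and is left to the reader.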
trurl Posted May 11, 2020
I have one disk with 2 CRC errors. I acknowledged them long ago by clicking on the "thumbs down" on the Dashboard. Since they haven't increased I don't get any further warnings. Is that what you have done to acknowledge them? Have they increased?
technorati (Author) Posted May 11, 2020 (edited May 11, 2020 by technorati)
15 minutes ago, trurl said: I have one disk with 2 CRC errors. I acknowledged them long ago by clicking on the "thumbs down" on the Dashboard. Since they haven't increased I don't get any further warnings. Is that what you have done to acknowledge them? Have they increased?
Yes, I clicked the "Thumbs down" icon and chose "Acknowledge" from the menu. The values (UDMA CRC Count) have not increased on the drives, but certain events (stopping the array, rebooting the server) sometimes cause unRAID to alert me again on the same value in the same attribute. No other attribute on either drive shows any indicators of an alert.
trurl Posted May 11, 2020
9 minutes ago, technorati said: No other attribute on either drive shows any indicators of an alert.
From Main, click on a drive to get to its page, then go to the Attributes section. Do any of the other attributes have yellow highlight?
I am sure the acknowledged count must be stored on flash somewhere but I don't know the details. Since it isn't happening all the time, I wonder if you have an intermittent flash problem. When it happens again, post new diagnostics.
technorati (Author) Posted May 11, 2020
3 minutes ago, trurl said: From Main, click on a drive to get to its page, then go to the Attributes section. Do any of the other attributes have yellow highlight?
No - that's exactly what I meant by "no other attribute on either drive shows any indicators of an alert" - the only field in yellow on either drive is that UDMA CRC Count.
juggernaut-diagnostics-20200511-1001.zip
Attached is an updated diagnostics.zip from when I had the error this morning.
trurl Posted May 11, 2020
2 hours ago, technorati said: No - that's exactly what I meant by "no other attribute on either drive shows any indicators of an alert" - the only field in yellow on either drive is that UDMA CRC Count.
The yellow highlight will not go away when you access the drive's Attributes. It is the Dashboard SMART warning icon and associated Notifications that are supposed to no longer happen once you have acknowledged, until the count increases. Are you still getting those?
Nothing obvious in diagnostics that makes me think there is a Flash problem. Has Fix Common Problems ever told you there was a Flash problem?
Not ideal, but you can configure which SMART attributes get monitored for each disk: click on the disk to get to its page, then go to SMART Settings and uncheck the box for the attribute. Of course that means it will never check that attribute again, though you can always take a look at it yourself in the Attributes section as before.
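The behaviour described in this post (no further warning once a count is acknowledged, until the count increases) can be illustrated with a small Python sketch. It is only a model of the described behaviour, not unRAID's actual code: the function names and the disk labels are hypothetical, and the counts 12 and 4 are the ones reported at the top of the thread.

```python
from typing import Dict, List


def needs_warning(current: int, acknowledged: int = 0) -> bool:
    """Warn only when the raw count is above the last acknowledged value."""
    return current > acknowledged


def disks_to_warn(current: Dict[str, int], acked: Dict[str, int]) -> List[str]:
    """Return the disks whose count exceeds what was acknowledged for them."""
    return sorted(
        disk for disk, count in current.items()
        if needs_warning(count, acked.get(disk, 0))
    )


# Hypothetical disk labels; counts taken from the first post.
current_counts = {"disk1": 12, "disk2": 4, "disk3": 0}

# Both counts acknowledged and the acknowledgement retained: no warnings.
print(disks_to_warn(current_counts, {"disk1": 12, "disk2": 4}))

# Acknowledgement lost (for example, never saved): both disks warn again.
print(disks_to_warn(current_counts, {}))
```

Under this model, a warning that returns without the count changing would point at the stored acknowledgement being lost or not read back, which is consistent with the flash-related questions asked above, though nothing in the thread confirms that as the cause.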
technorati (Author) Posted May 11, 2020
3 hours ago, trurl said: The yellow highlight will not go away when you access the drive's Attributes. It is the Dashboard SMART warning icon and associated Notifications that are supposed to no longer happen once you have acknowledged, until the count increases. Are you still getting those?
Yes, I'm still getting them - there's a screenshot a few posts above here showing that it came back again this morning.
trurl said: Nothing obvious in diagnostics that makes me think there is a Flash problem. Has Fix Common Problems ever told you there was a Flash problem?
No, I just ran "Fix Common Problems" and it doesn't report any issues with flash; nor have I ever seen anything in the logs or other notifications suggesting there was.
trurl said: Not ideal, but you can configure each disk regarding which SMART attributes get monitored by clicking on the disk to get to its page, then go to SMART Settings and uncheck the box for the attribute. Of course that means it will never check that attribute again, though you can always take a look at it yourself in the Attributes section as before.
I'd prefer to live with it, so at least I get notified if the count does increase, indicating that a larger problem exists. Mostly, I was just curious why it would keep notifying me of something I'd already acknowledged, but it sounds like there's not a ready/obvious answer to that.
JorgeB Posted May 13, 2020
Acknowledge the warnings, grab diags and a screenshot of the dashboard, then reboot. If more errors appear, do the same again, then post them all.
Falcosc Posted May 20, 2022 (edited May 20, 2022 by Falcosc)
I was not sure about the state of my disk because I could not remember whether I had hit Acknowledge half a year ago and the disk had degraded since then, or whether I had never hit Acknowledge at all. Because the UI does not show the values stored at the last acknowledgement, I searched for the config file. This topic was listed in the Google search, so I will provide the answer here:
/boot/config/plugins/dynamix/monitor.ini
This file tells you the old values and helps you decide whether your disk has degraded significantly or just had a hiccup. I recommend making a copy of this file with the current timestamp, because it does not tell you on which date these values were recorded, and the extended SMART error log only saves errors.
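The timestamped copy recommended here could be made with a few lines of Python. This is a minimal sketch under some assumptions: Python 3 may not be installed on the server itself, the helper name `backup_with_timestamp` is hypothetical, and the naming scheme for the copy is just one reasonable choice. Only the path to monitor.ini comes from the post above.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional

# Path quoted in the post above; it only exists on an unRAID server.
MONITOR_INI = Path("/boot/config/plugins/dynamix/monitor.ini")


def backup_with_timestamp(src: Path,
                          dest_dir: Optional[Path] = None,
                          now: Optional[datetime] = None) -> Path:
    """Copy src to '<stem>.<YYYYmmdd-HHMMSS><suffix>' and return the new path."""
    now = now or datetime.now()
    dest_dir = dest_dir or src.parent
    stamp = now.strftime("%Y%m%d-%H%M%S")
    dest = dest_dir / f"{src.stem}.{stamp}{src.suffix}"
    # copyfile copies the contents only, leaving the original untouched.
    shutil.copyfile(src, dest)
    return dest


if __name__ == "__main__":
    if MONITOR_INI.exists():
        print(backup_with_timestamp(MONITOR_INI))
    else:
        print(f"{MONITOR_INI} not found; run this where the flash drive is mounted")
```

Keeping a few such copies makes it possible to compare the acknowledged values over time, since the file itself carries no date for when they were recorded.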