disk read error - where do i go from here?


Olymoly


While at work, I received an email about a disk having a read error.

 

I pulled diagnostics and stopped Docker to prevent any more reads or writes. I ordered replacement drives, which should be here tomorrow. I didn't notice until afterwards, but a monthly scheduled parity check ran after the error occurred.

 

My parity was listed as valid prior to this error, but I doubt it was working since the disk status was disabled. I was told it was likely a power issue, but I was unable to finish troubleshooting that parity issue. So here I am, wondering how much damage has been done.

 

thanks for your help.

 

nastradamus-diagnostics-20210901-0546.zip

Link to comment

If a disk is disabled then it is ignored by a parity check. In addition, if the number of disabled drives is as large as your number of parity drives, then the check becomes merely a read check of the remaining drives.

 

Do you have the scheduled check set to be correcting or non-correcting? Non-correcting is recommended so that if you have a drive acting up it does not run the risk of corrupting parity. The main purpose of the check is to alert you, any time a run reports a non-zero number of errors, that you may have a problem that needs further investigation.
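
To illustrate why a correcting check is risky while a drive is acting up, here is a toy sketch. It assumes single XOR parity and treats each disk as one byte; the helper names `xor_parity` and `check_parity` are made up for illustration, and this is not how Unraid itself is implemented.

```shell
# Toy model: each "disk" holds one byte, and parity is the XOR of all data disks.
# xor_parity and check_parity are hypothetical helpers, not Unraid code.

# Print the XOR of all arguments.
xor_parity() (
  p=0
  for b in "$@"; do
    p=$(( p ^ b ))
  done
  echo "$p"
)

# check_parity STORED DATA...: compare stored parity with parity recomputed from the data.
check_parity() (
  stored=$1
  shift
  if [ "$(xor_parity "$@")" -eq "$stored" ]; then
    echo "ok"
  else
    echo "sync error"
  fi
)

d1=90; d2=60; d3=240
parity=$(xor_parity "$d1" "$d2" "$d3")   # written while every disk was healthy

# Disk 2 starts misreading: it returns 61 instead of 60.
bad_d2=61

# Non-correcting check: the mismatch is reported and parity is left alone.
echo "non-correcting check: $(check_parity "$parity" "$d1" "$bad_d2" "$d3")"

# Correcting check: parity is overwritten with the value computed from the bad read.
corrected=$(xor_parity "$d1" "$bad_d2" "$d3")

# Rebuild disk 2 from parity plus the other data disks.
echo "rebuild from untouched parity: $(xor_parity "$parity" "$d1" "$d3")"
echo "rebuild from corrected parity: $(xor_parity "$corrected" "$d1" "$d3")"
```

In this toy case the rebuild from untouched parity gives back the original 60, while the rebuild from the "corrected" parity gives back the misread 61, so the bad read has been baked into parity.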

Link to comment

 

 

On 9/2/2021 at 2:51 AM, itimpi said:

If a disk is disabled then it is ignored by a parity check. In addition, if the number of disabled drives is as large as your number of parity drives, then the check becomes merely a read check of the remaining drives.

 

Thanks for your help, that is good to know. AFAIK the only disks that are disabled are the parity disks; I have been unable to figure out why or how to fix this. I've tried changing drives and switching slots. I have dual 920W PSUs in my Supermicro case, so I don't see how it could be a power issue as was suggested, and I haven't been able to find reasonable replacements.

 

On 9/2/2021 at 2:51 AM, itimpi said:

Do you have the scheduled check set to be correcting or non-correcting?

 

The scheduler seems to be set to correcting. I disabled the scheduled check, but the correcting option is also set to Yes. AFAIK it was enabled.

 

 

I've also seen that my log is at 100%, at 188 GBs. I would like to restart the server but don't want to cause any more damage. Until I know what is going on with my parity, I'm trying to be cautious.

 

I'm considering whether my next step should be to reboot, then start in maintenance mode and run a SMART check of the offending drive?

 

 

Link to comment

Rebooted, and here are the latest diagnostics.

 

It takes about 20 minutes to power down and then another 20 minutes for the disks to show up so I can assign them and start the array.

 

Started in maintenance mode and am running an extended SMART check on disk 6.

 

I am most worried about my data on disk 6, as my parity issue seems to me to be a lost cause and I will have to start over to get it working. I plan to do this when I have a 2-week vacation at the end of the month.

 

I also received 2 drives that I will begin to preclear tonight. I still haven't decided how I want to handle them, but I will decide in time.

 

first boot nastradamus-diagnostics-20210905-1201.zip

Link to comment
On 9/3/2021 at 2:51 PM, Olymoly said:

could this be because the reason

Your log space was full before reboot, and shortly after reboot your log space is nearly 10% full even though syslog isn't very large. I don't quite remember exactly what shows up in diagnostics for your old version of Unraid, but there also don't seem to be any logs from docker or VMs. You have those enabled, but do you actually have any dockers or VMs?

 

My usual suspect for filling log would be atop from NerdPack, but you don't seem to have that either.

 

What do you get from command line with this?

du -h -d 1 /var/log
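
If the output is long, a small wrapper can sort it so the biggest entries come first. This is only a sketch: `largest_entries` is a made-up helper name, and it assumes GNU `du`, `sort` and `head` are available. Unlike the command above, it adds `-a` so that individual files directly under the directory are listed as well as subdirectories.

```shell
# largest_entries DIR [N]: list the N largest entries directly under DIR,
# sizes in KiB, biggest first. The first line is normally the total for DIR.
# Hypothetical helper for illustration; defaults to /var/log and 10 entries.
largest_entries() {
  du -a -k -d 1 "${1:-/var/log}" 2>/dev/null | sort -rn | head -n "${2:-10}"
}

# Show how full the log filesystem is, then what is taking the space.
if [ -d /var/log ]; then
  df -h /var/log
  largest_entries /var/log 10
fi
```

Running it again a few minutes later and comparing the numbers should show which file or folder is still growing.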

 

Link to comment

I don't have VMs, I have Dockers; however, the array was started in maintenance mode to run SMART.

SMART came back with no errors on the disk. I have nothing missing at a glance, so I guess that reboot went a long way and fixed the immediate issue.

I'll post new diagnostics when I have a chance, since I'm at work atm

Link to comment
