Brydezen Posted April 9, 2020 Hello, when I turned on my server this morning nothing seemed wrong at first. But when I went to check the Unraid web interface, it showed a red X by the parity disk and said the disk is disabled. I have read some other threads where people thought their device might be offline because they could not perform any SMART tests, but mine runs them just fine. So I decided to run a parity check, and it passed without any errors, but the disk still reports as disabled. What should I do about it? Best regards, Brydezen tower-diagnostics-20200409-1950.zip Edited April 9, 2020 by Brydezen
JorgeB Posted April 9, 2020 The diags are from after rebooting, so we can't see what happened, but the disk looks fine, so you can re-enable the drive. I recommend replacing/swapping cables first to rule them out in case it happens again.
curtis-r Posted April 9, 2020 A week ago I was having write-speed issues on a drive (per this thread). I replaced the cables on that drive & parity. I also started converting some drives to xfs. Things seemed to be OK except for the red 'x' on parity. I tried unassigning the parity drive, starting the array, stopping it, then reassigning the drive, but it's back to red. Today I ran a SMART test, which was good. I just executed the Trust the Parity process & the parity started rebuilding. It first gave a green ball & said the parity-sync was in progress, then "Parity disk in error state", then "Parity sync / Data rebuild finished (0 errors)" but "canceled" under Description. Next to the parity disk it lists 736 Writes and 296 Errors. The parity check is no longer progressing and the red 'x' is back. ST4000DM000-1F2168_W3006CAF-20200409-1105.txt Edited April 9, 2020 by curtis-r (writes & errors)
JorgeB Posted April 9, 2020 As mentioned, without diags there's not much we can see. Try to sync again and post diags if the same thing happens (before rebooting).
Fffrank Posted April 9, 2020 736 write errors means the cabling is likely bad or the drive is dying. Unraid puts the brakes on and disables that drive if it starts seeing that many errors.
trurl Posted April 9, 2020 54 minutes ago, Fffrank said: "736 write errors means the cabling is likely bad or the drive is dying. Unraid puts the brakes on and disables that drive if it starts seeing that many errors." A single unrecoverable write error is all that is required to disable a disk, since that disk is out of sync with parity at that point.
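To illustrate trurl's point, here is a minimal Python sketch of why one missed write is enough. It assumes single parity is a plain byte-wise XOR across the data disks; the thread does not state this, and the disk contents and the `xor_parity` helper are made up for the example, not taken from Unraid.

```python
from functools import reduce


def xor_parity(blocks):
    """Byte-wise XOR of equally sized blocks (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data disks, one 4-byte stripe each.
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_parity(data)

# With valid parity, any single missing disk equals the XOR of all the others.
rebuilt = xor_parity([data[0], data[2], parity])

# A write changes one byte on disk 0, but the matching parity write fails,
# so the parity disk still holds the old value.
new_data = [b"\xff\x02\x03\x04", data[1], data[2]]
stale_parity = parity

# Rebuilding disk 1 from the stale parity now yields wrong bytes.
bad_rebuild = xor_parity([new_data[0], new_data[2], stale_parity])

if __name__ == "__main__":
    print("rebuild with valid parity matches:", rebuilt == data[1])
    print("rebuild with stale parity matches:", bad_rebuild == data[1])
```

Under that assumption, a disk that missed even one write can no longer be trusted for a rebuild, which is consistent with the array disabling it right away instead of waiting for an error count to build up.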
curtis-r Posted April 9, 2020 After trying New Config again, it kept all my drive assignments, but the parity slot was blank and my drive was not listed as available. I rebooted (with the array still stopped) & all assignments were lost, so I reassigned them as they were & started the array. Parity now has an orange triangle and is running a parity rebuild. I already changed the cable, but if this fails, I think I have an extra SATA port, so I'll try switching the parity drive to that. The diagnostics I attached were from after a reboot, so they're probably not useful. If the rebuild fails, I'll post that diag. Thanks. tower-diagnostics-20200409-1348.zip
curtis-r Posted April 10, 2020 The parity rebuild finished with no errors on the parity, so fingers crossed that this sticks. One other drive did have some errors, so I'm not sure what that's about. Diag & SMART attached. tower-diagnostics-20200410-0806.zip tower-smart-20200410-0808.zip
curtis-r Posted April 10, 2020 Spoke too soon. Parity failed again. Diag attached. tower-diagnostics-20200410-0824.zip
JorgeB Posted April 10, 2020 Disk5 appears to be failing. You can confirm by running an extended SMART test, or by running another parity sync to see if it goes better this time. Note that the current parity isn't 100% valid because of the read errors on disk5 during the sync.
JorgeB Posted April 10, 2020 Parity looks more like a connection problem, replace both cables.
curtis-r Posted April 10, 2020 Before seeing the last posts, I changed the parity drive's SATA port only, then started a New Config & rebuild. If disk5 shows errors I will do the same for it. Thanks.
curtis-r Posted April 11, 2020 Parity rebuild completed but disk5 had errors (SMART attached). Tomorrow I'll replace its cable & possibly move it to a new SATA port. tower-smart-20200410-2337.zip
JorgeB Posted April 11, 2020 2 hours ago, curtis-r said: "disk5 had errors (SMART attached)" The SMART tests confirm disk5 is failing; new cables or a different port won't help.
curtis-r Posted April 11, 2020 For my understanding, can you tell me what in the SMART report tells you the drive is failing? Thanks.
JorgeB Posted April 11, 2020 Even the short SMART test is failing: "read failure"

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure        90%             1047  5661095093
# 2  Short offline       Completed: read failure        90%             1033  5661095093
# 3  Short offline       Completed without error        00%               50  -
# 4  Extended offline    Aborted by host                50%               50  -
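For anyone who wants to check this kind of log without reading it by eye, here is a rough Python sketch that picks out the self-test rows whose status contains "failure". The sample rows are copied from the log quoted above; the regular expression and the `failed_self_tests` helper are assumptions made for this example and only cover the row layout shown here, not every format smartctl can print.

```python
import re

# Rows copied from the self-test log quoted in the post above.
SAMPLE_LOG = """\
# 1  Short offline       Completed: read failure        90%             1047  5661095093
# 2  Short offline       Completed: read failure        90%             1033  5661095093
# 3  Short offline       Completed without error        00%               50  -
# 4  Extended offline    Aborted by host                50%               50  -
"""

# Hypothetical pattern for the row layout above (not an official smartctl grammar).
ROW = re.compile(
    r"^#\s*(?P<num>\d+)\s+"
    r"(?P<test>Short offline|Extended offline)\s+"
    r"(?P<status>.+?)\s+"
    r"(?P<remaining>\d+)%\s+"
    r"(?P<hours>\d+)\s+"
    r"(?P<lba>\d+|-)\s*$"
)


def failed_self_tests(log_text):
    """Return (num, test, status, lifetime_hours, lba) for rows reporting a failure."""
    failures = []
    for line in log_text.splitlines():
        match = ROW.match(line.strip())
        if match and "failure" in match.group("status"):
            failures.append(
                (
                    int(match.group("num")),
                    match.group("test"),
                    match.group("status"),
                    int(match.group("hours")),
                    match.group("lba"),
                )
            )
    return failures


if __name__ == "__main__":
    for num, test, status, hours, lba in failed_self_tests(SAMPLE_LOG):
        print(f"test #{num} ({test}) at {hours}h: {status}, first error LBA {lba}")
```

On the rows above it flags the two short tests that ended in a read failure at the same LBA, which is the detail JorgeB points to. The aborted extended test is not counted as a failure by this sketch.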