Flick Posted January 22, 2018
Any ideas? It's rather confusing to me that the disk is offline in Unraid but passes the SMART extended self-test. Attached is the full report. Any help is appreciated! Thank you. BadDrive.zip
sureguy Posted January 22, 2018
This generally means a write to the disk failed and it needs to be rebuilt.
JorgeB Posted January 22, 2018
Please post the complete diagnostics, ideally after the disk was disabled and before rebooting: Tools -> Diagnostics
JorgeB Posted January 22, 2018
1 minute ago, francrouge said: Here you go thx nas-diagnostics-20180121-1918.zip
That was to the OP; I already responded to you on your thread.
Flick (Author) Posted January 22, 2018
Unraid v6.4. Attached are the diagnostics as requested, although I'm afraid I've already rebooted since the error first popped up: I shut down to check cables and connections. Thank you! library-diagnostics-20180122-0755.zip
Edited January 22, 2018 by Flick
SSD Posted January 22, 2018
Will leave it to Johnnie to look at the logs, but just wanted to reply on how a perfectly good disk can drop offline. There are basically two common scenarios:
1 - The disk has a cabling issue, and under load the drive loses its connection to the controller. Once this happens the disk is inaccessible. Often a reboot will restore the connection. An extended test DOES NOT require contact with the controller (only to kick it off), so it is not a good way to detect cabling issues. But if you start to see CRC errors on the SMART report, that is a very good indicator of cabling problems.
2 - There are some controllers (Marvell) that will cause drives to drop offline with no cabling problem. It is recommended to replace them with LSI controllers (e.g., SAS9201-8i). Some users don't have issues while others do.
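
As a rough illustration of the CRC check mentioned above, here is a minimal Python sketch that pulls the raw CRC error count out of `smartctl -A` output. It assumes the drive reports the count as attribute 199 (`UDMA_CRC_Error_Count`), which is common on SATA drives but not universal, and that the raw value is the tenth column of the attribute table. The function names and the `/dev/sdX` device path are hypothetical.

```python
import subprocess

# Assumption: the drive exposes CRC errors as attribute 199 (UDMA_CRC_Error_Count).
CRC_ATTRIBUTE_ID = "199"


def crc_error_count(smart_attributes_text):
    """Return the raw CRC error count from `smartctl -A` text, or None if absent."""
    for line in smart_attributes_text.splitlines():
        fields = line.split()
        # Attribute rows have 10 columns; the last one (index 9) is RAW_VALUE.
        if len(fields) >= 10 and fields[0] == CRC_ATTRIBUTE_ID:
            try:
                return int(fields[9])
            except ValueError:
                return None
    return None


def read_crc_error_count(device):
    """Run `smartctl -A` on a device (hypothetical path such as /dev/sdX)."""
    result = subprocess.run(
        ["smartctl", "-A", device],
        capture_output=True,
        text=True,
        check=False,  # smartctl uses a bit-mask exit status, so non-zero is not always fatal
    )
    return crc_error_count(result.stdout)
```

A count that keeps rising between checks points at the cable or connection rather than the disk itself; a single non-zero value may just be left over from an earlier cabling problem.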
JorgeB Posted January 22, 2018
Without a pre-reboot syslog we can't see what happened, but the disk looks healthy, so you'll need to rebuild, using either a new disk or the old disk. Since you have dual parity it's not so risky to use the old one. Just make sure that the contents of the emulated disk look correct, since whatever's there is what's going to be on the rebuilt disk. I would also recommend swapping/replacing cables/backplane slot, just to rule that out in case the same disk fails again.
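
One way to eyeball whether the emulated disk "looks correct" before rebuilding is to count its files and total size and compare that with what you expect to be there. The sketch below is only an illustration: `summarize_tree` is a hypothetical helper, and the mount point you pass in (for example `/mnt/disk3`) depends on which slot is being emulated on your array.

```python
import os


def summarize_tree(root):
    """Walk a directory tree and return (file_count, total_bytes)."""
    file_count = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                total_bytes += os.path.getsize(path)
            except OSError:
                # Skip entries that cannot be read (e.g. broken symlinks).
                continue
            file_count += 1
    return file_count, total_bytes
```

Calling `summarize_tree("/mnt/disk3")` with the array started would give a quick file count and size for that (hypothetical) emulated slot. It is not a substitute for actually opening a few files, since a count says nothing about whether the contents are intact.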
trurl Posted January 22, 2018
And I would just add another thing to consider for future support. Don't hijack another user's support thread. Even if you have what you think is the same problem. Even if it really is the same problem. The details of your system will be different, and the details of the solution may turn out to be different. There is no good reason to create confusion by mixing up questions and answers for different users, especially when data is on the line. /badcop
Flick (Author) Posted January 22, 2018
2 hours ago, trurl said: And I would just add another thing to consider for future support. Don't hijack another user's support thread. Even if you have what you think is the same problem. Even if it really is the same problem. The details of your system will be different, and the details of the solution may turn out to be different. There is no good reason to create confusion by mixing up questions and answers for different users, especially when data is on the line. /badcop
Um, dude? This is *my* thread.
Flick (Author) Posted January 22, 2018
2 hours ago, johnnie.black said: Without a pre-reboot syslog we can't see what happened, but the disk looks healthy, so you'll need to rebuild, using either a new disk or the old disk. Since you have dual parity it's not so risky to use the old one. Just make sure that the contents of the emulated disk look correct, since whatever's there is what's going to be on the rebuilt disk. I would also recommend swapping/replacing cables/backplane slot, just to rule that out in case the same disk fails again.
Yeah, I was afraid of that. Normally I'm pretty good at working through all the steps, but I simply missed doing so this time. I appreciate the feedback. I've never had to rebuild an existing disk; could you point me towards instructions to do so? I've only put in new disks in the past but, in this case, the drive has only been going for three months, so I'd like to give it another shot. What would be ideal, actually, would be if I could put it into a different slot. That way, should it fail again, it points to the disk rather than cables/connections. Thoughts? I did go in and clean all cables, reseat connections, etc. This has been a rock solid system for several years now.
2 hours ago, SSD said: Will leave it to Johnnie to look at the logs, but just wanted to reply on how a perfectly good disk can drop offline. There are basically two common scenarios: 1 - The disk has a cabling issue, and under load the drive loses its connection to the controller. Once this happens the disk is inaccessible. Often a reboot will restore the connection. An extended test DOES NOT require contact with the controller (only to kick it off), so it is not a good way to detect cabling issues. But if you start to see CRC errors on the SMART report, that is a very good indicator of cabling problems. 2 - There are some controllers (Marvell) that will cause drives to drop offline with no cabling problem. It is recommended to replace them with LSI controllers (e.g., SAS9201-8i). Some users don't have issues while others do.
1. Ah, I did not know that about the SMART kick off. TIL. Thanks! 2. Shouldn't be an issue; this controller has been in play for many years with no issues. Thank you for the assistance. Love this community!
trurl Posted January 22, 2018
11 minutes ago, Flick said: Um, dude? This is *my* thread.
Of course it is. Did you not see that another user had posted their diagnostics on your thread?
JorgeB Posted January 22, 2018
4 minutes ago, Flick said: I've never had to rebuild an existing disk; could you point me towards instructions to do so?
https://lime-technology.com/wiki/Troubleshooting#Re-enable_the_drive
4 minutes ago, Flick said: What would be ideal, actually, would be if I could put it into a different slot. That way, should it fail again, it points to disk rather than cables/connections. Thoughts? I did go in and clean all cables, reseat connections, etc. This has been a rock solid system for several years now.
Exactly, so if it fails again there could really be an issue with the disk. SMART is a good indication, but a healthy SMART doesn't always equal a healthy disk.
Flick (Author) Posted January 22, 2018
1 hour ago, trurl said: Of course it is. Did you not see that another user had posted their diagnostics on your thread?
Yup, I did and was ignoring it. Since you didn't quote, I assumed you'd gotten us mixed up, as I was the last "non-support person" to post in the thread. No worries, mate, I appreciate you helping to keep the forums clean!
1 hour ago, johnnie.black said: https://lime-technology.com/wiki/Troubleshooting#Re-enable_the_drive Exactly, so if it fails again there could really be an issue with the disk. SMART is a good indication, but a healthy SMART doesn't always equal a healthy disk.
Perfect, I'll give this a go this evening. Thank you!
Flick (Author) Posted January 25, 2018
Done, looks like all went well. Fingers crossed!
Event: unRAID Parity sync / Data rebuild
Subject: Notice [LIBRARY] - Parity sync / Data rebuild finished (0 errors)
Description: Duration: 1 day, 5 hours, 52 seconds. Average speed: 57.5 MB/s
Importance: normal
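
As a quick sanity check on a notification like this one, duration multiplied by average speed gives roughly how much data the rebuild processed. The sketch below is only an illustration: it assumes the notification means zero minutes (none are listed) and that MB means 10^6 bytes; the function name is made up for this example.

```python
def rebuilt_terabytes(days, hours, minutes, seconds, avg_mb_per_s):
    """Estimate data processed (in TB, 10^12 bytes) from duration and average speed."""
    total_seconds = ((days * 24 + hours) * 60 + minutes) * 60 + seconds
    total_megabytes = total_seconds * avg_mb_per_s
    return total_megabytes / 1_000_000  # 10^6 MB per TB


# Values taken from the notification: 1 day, 5 hours, 52 seconds at 57.5 MB/s.
estimate = rebuilt_terabytes(1, 5, 0, 52, 57.5)
print(f"Approximate data processed: {estimate:.2f} TB")
```

For these figures the estimate comes out at roughly 6 TB. If the result is far from the size you expect for the rebuilt region, that is a hint to look more closely at the logs.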
tiwing Posted July 6, 2020
Hi, I have the same issue - link above no longer works and I suck at searching - tried, no luck. Any updated link, or revised instructions for this scenario? thanks tiwing
Edited July 6, 2020 by tiwing
JorgeB Posted July 6, 2020
6 minutes ago, tiwing said: Hi, I have the same issue - link above no longer works and I suck at searching
If you mean the link to re-enable a drive, it's below, but make sure the drive is healthy and that the emulated disk is mounting correctly and its contents look correct before rebuilding on top. https://wiki.unraid.net/Troubleshooting#Re-enable_the_drive