"DEVICE IS DISABLED, CONTENTS EMULATED" but SMART Full Test Passes????


Flick


Will leave to Johnnie to look at the logs, but just wanted to reply on how a perfectly good disk can drop offline.

 

There are basically two common scenarios:

1 - The disk has a cabling issue, and under load the drive loses its connection to the controller. Once this happens the disk is inaccessible. Often a reboot will restore the connection. An extended test DOES NOT require contact with the controller (only to kick it off), so it is not a good way to detect cabling issues. But if you start to see CRC errors on the SMART report, that is a very good indicator of cabling problems.

 

2 - There are some controllers (Marvell) that will cause drives to drop offline, with no cabling problem. It is recommended to replace with LSI controllers (e.g., SAS9201-8i). Some users don't have issues while others do.
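
For anyone wanting to watch for the CRC errors mentioned in scenario 1, here is a minimal sketch in Python. It assumes the drive reports the counter as SMART attribute 199 (usually labelled UDMA_CRC_Error_Count on SATA drives; some vendors name or number it differently) and that `smartctl -A` from smartmontools is available. The sample output and the function names are illustrative, not taken from the poster's system.

```python
import subprocess
from typing import Optional

# Assumption: the interface CRC counter is SMART attribute 199.
CRC_ATTRIBUTE_ID = "199"

# Illustrative excerpt in the layout of `smartctl -A` output (not real data).
SAMPLE_OUTPUT = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       12
"""


def crc_error_count(smart_output: str) -> Optional[int]:
    """Return the raw value of attribute 199, or None if it is not listed."""
    for line in smart_output.splitlines():
        fields = line.split()
        # Attribute rows have ten columns; RAW_VALUE is the tenth.
        if len(fields) >= 10 and fields[0] == CRC_ATTRIBUTE_ID:
            try:
                return int(fields[9])
            except ValueError:
                return None
    return None


def read_attributes(device: str) -> str:
    """Run `smartctl -A` on a device such as /dev/sdb (usually needs root)."""
    # smartctl uses non-zero exit codes as status bits, so they are not
    # treated as failures here.
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True
    )
    return result.stdout


def crc_errors_grew(before: str, after: str) -> bool:
    """True if the CRC counter is higher in the second reading."""
    first, second = crc_error_count(before), crc_error_count(after)
    return first is not None and second is not None and second > first


print(crc_error_count(SAMPLE_OUTPUT))  # prints 12
```

The counter is generally cumulative, so a nonzero value only shows there was trouble at some point. What points at a current cabling problem is the number climbing between two readings, for example one taken before and one after putting the drive under load.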

Link to comment

Without a pre-reboot syslog we can't see what happened, but the disk looks healthy, so you'll need to rebuild, using either a new disk or the old one. Since you have dual parity, it's not so risky to use the old one. Just make sure that the contents of the emulated disk look correct, since whatever's there is what's going to be on the rebuilt disk. I would also recommend swapping or replacing the cables/backplane slot, just to rule that out in case the same disk fails again.
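
One rough way to sanity-check the emulated disk before rebuilding is to summarize what is on it and compare that with what you expect to be there. The sketch below is only an illustration: `summarize_tree` is a made-up helper, and the mount point in the usage comment (`/mnt/disk3`) is an assumption about where the emulated disk appears on your system.

```python
import os
from pathlib import Path


def summarize_tree(root):
    """Return {top_level_name: (file_count, total_bytes)} for entries under root."""
    summary = {}
    for entry in sorted(Path(root).iterdir()):
        if entry.is_file():
            summary[entry.name] = (1, entry.stat().st_size)
            continue
        count = 0
        size = 0
        for dirpath, _dirnames, filenames in os.walk(entry):
            for name in filenames:
                try:
                    size += (Path(dirpath) / name).stat().st_size
                    count += 1
                except OSError:
                    # Unreadable files are skipped here; they deserve a
                    # closer look by hand before rebuilding.
                    continue
        summary[entry.name] = (count, size)
    return summary


# Hypothetical usage, assuming the emulated disk is mounted at /mnt/disk3:
# for name, (count, size) in summarize_tree("/mnt/disk3").items():
#     print(f"{name}: {count} files, {size / 1e9:.1f} GB")
```

Empty top-level folders, or totals far below what the disk used to hold, are a reason to stop and ask before starting the rebuild. This only checks that files are listed, not that their contents are intact.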

Link to comment

And I would just add another thing to consider for future support.

 

Don't hijack another user's support thread. Even if you have what you think is the same problem. Even if it really is the same problem. The details of your system will be different, and the details of the solution may turn out to be different. There is no good reason to create confusion by mixing up questions and answers for different users, especially when data is on the line.

 

/badcop

Link to comment
2 hours ago, trurl said:

And I would just add another thing to consider for future support.

 

Don't hijack another user's support thread. Even if you have what you think is the same problem. Even if it really is the same problem. The details of your system will be different, and the details of the solution may turn out to be different. There is no good reason to create confusion by mixing up questions and answers for different users, especially when data is on the line.

 

/badcop

 

Um, dude? This is *my* thread.

Link to comment
2 hours ago, johnnie.black said:

Without a pre-reboot syslog we can't see what happened, but the disk looks healthy, so you'll need to rebuild, using either a new disk or the old one. Since you have dual parity, it's not so risky to use the old one. Just make sure that the contents of the emulated disk look correct, since whatever's there is what's going to be on the rebuilt disk. I would also recommend swapping or replacing the cables/backplane slot, just to rule that out in case the same disk fails again.

 

Yeah, I was afraid of that. Normally I'm pretty good at working through all the steps, but I simply missed doing so this time. I appreciate the feedback. I've never had to rebuild an existing disk; could you point me towards instructions to do so? I've only put in new disks in the past but, in this case, the drive has only been going for three months, so I'd like to give it another shot.

 

What would be ideal, actually, would be if I could put it into a different slot. That way, should it fail again, it points to disk rather than cables/connections. Thoughts? I did go in and clean all cables, reseat connections, etc. This has been a rock solid system for several years now.

 

2 hours ago, SSD said:

Will leave to Johnnie to look at the logs, but just wanted to reply on how a perfectly good disk can drop offline.

 

There are basically two common scenarios:

1 - The disk has a cabling issue, and under load the drive loses its connection to the controller. Once this happens the disk is inaccessible. Often a reboot will restore the connection. An extended test DOES NOT require contact with the controller (only to kick it off), so it is not a good way to detect cabling issues. But if you start to see CRC errors on the SMART report, that is a very good indicator of cabling problems.

 

2 - There are some controllers (Marvell) that will cause drives to drop offline, with no cabling problem. It is recommended to replace with LSI controllers (e.g., SAS9201-8i). Some users don't have issues while others do.

 

1. Ah, I did not know that about the SMART kick off. TIL. Thanks!

 

2. Shouldn't be an issue; this controller has been in play for many years with no problems.

 

Thank you for the assistance. Love this community!

Link to comment
4 minutes ago, Flick said:

I've never had to rebuild an existing disk; could you point me towards instructions to do so?

https://lime-technology.com/wiki/Troubleshooting#Re-enable_the_drive

 

4 minutes ago, Flick said:

What would be ideal, actually, would be if I could put it into a different slot. That way, should it fail again, it points to disk rather than cables/connections. Thoughts? I did go in and clean all cables, reseat connections, etc. This has been a rock solid system for several years now.

Exactly, so if it fails again there could really be an issue with the disk. SMART is a good indication, but a healthy SMART report doesn't always mean a healthy disk.

Link to comment
1 hour ago, trurl said:

Of course it is. Did you not see that another user had posted their diagnostics on your thread?

Yup, I did and was ignoring it. Since you didn't quote, I assumed you'd gotten us mixed up as I was the last "non-support person" to post in the thread. No worries, mate, I appreciate you helping to keep the forums clean!

 

1 hour ago, johnnie.black said:

https://lime-technology.com/wiki/Troubleshooting#Re-enable_the_drive

 

Exactly, so if it fails again there could really be an issue with the disk. SMART is a good indication, but a healthy SMART report doesn't always mean a healthy disk.

 

Perfect, I'll give this a go this evening. Thank you!

Link to comment