"Device is Disabled, Contents emulated" Write failed?


Cazzy


So I was half drunk the other day when I got an email that my drive failed a write, for some reason lol. I ran the short and extended SMART tests and both came back clean. I saw earlier posts where users were told to take the drive offline, remove it from the array, re-add it, and rebuild, but I want to make sure that's my only route here. Any reason, based on the files attached, to believe that removing the drive from the array and resyncing it won't work? What do you suggest, otherwise?

 

Diagnostics and Smart Test are attached. 

 

I don't have a drive to replace it with yet since Best Buy doesn't have them on sale right now (I know, I know. I should've gotten backup drives when they were on sale. Lesson learned, lol). 

 

Thanks in advance! 

plexhub-diagnostics-20201001-1051.zip plexhub-smart-20201001-1050.zip


SMART looks OK, but syslog seems to indicate these may be actual disk problems. Which controller is this disk on?

 

Since it passed extended test, you can try to rebuild to the same disk and see if it works.

 

Not necessary to remove, just

  1. Stop array
  2. Unassign disabled disk
  3. Start array with disabled disk unassigned
  4. Stop array
  5. Reassign disabled disk
  6. Start array to begin rebuild of disabled disk

On Main, you should see Writes to rebuilding disk, Reads from parity and all other array disks, nothing in the Errors column.

 

If there are problems post new diagnostics.


Looks like your LSI card is resetting - you may need to check cooling or reseat it in the slot. Also, not sure if it's related, but you're running an older version of the firmware (20.00.04.00). You should look into updating that.

28 minutes ago, Cazzy said:

Any reason, based on the files attached, to believe that removing the drive from the array and resyncing it won't work? What do you suggest, otherwise? 

does the emulated drive appear mountable?

7 hours ago, civic95man said:

Looks like your LSI card is resetting - you may need to check cooling or reseat it in the slot. Also, not sure if it's related, but you're running an older version of the firmware (20.00.04.00). You should look into updating that.

does the emulated drive appear mountable?

 

I appreciate it! I just flashed it with 20.00.07.00 and confirmed it flashed correctly. Yes, it did appear as mountable.

 

How were you able to tell the card was resetting, exactly, just for my knowledge? Would like to learn as I go along!

 

Thanks, man!

 

7 hours ago, trurl said:

SMART looks OK, but syslog seems to indicate these may be actual disk problems. Which controller is this disk on?

 

Since it passed extended test, you can try to rebuild to the same disk and see if it works.

 

Not necessary to remove, just

  1. Stop array
  2. Unassign disabled disk
  3. Start array with disabled disk unassigned
  4. Stop array
  5. Reassign disabled disk
  6. Start array to begin rebuild of disabled disk

On Main, you should see Writes to rebuilding disk, Reads from parity and all other array disks, nothing in the Errors column.

 

If there are problems post new diagnostics.

Doing this now and it looks to be rebuilding. I'll update if any issues come up once it's done! Thank you for the help so far!

14 hours ago, Cazzy said:

How were you able to tell the card was resetting, exactly, just for my knowledge? Would like to learn as I go along!

Towards the end of your syslog, it's filled with this:

Sep 27 16:57:11 PlexHub kernel: md: disk1 write error, sector=13050290504
Sep 27 16:57:11 PlexHub kernel: md: disk1 write error, sector=13050290512
Sep 27 16:57:11 PlexHub kernel: md: disk1 write error, sector=13050290520
Sep 27 16:57:11 PlexHub kernel: md: disk1 write error, sector=13050290528
Sep 27 16:57:11 PlexHub kernel: md: disk1 write error, sector=13050290536

That looks like a failing disk for one reason or another, but further back in the log, you see this:

Sep 27 16:56:08 PlexHub kernel: mpt2sas_cm0: fault_state(0x7e23)!
Sep 27 16:56:08 PlexHub kernel: mpt2sas_cm0: sending diag reset !!
Sep 27 16:56:09 PlexHub kernel: mpt2sas_cm0: diag reset: SUCCESS
Sep 27 16:56:09 PlexHub kernel: mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
Sep 27 16:56:09 PlexHub kernel: mpt2sas_cm0: LSISAS2008: FWVersion(20.00.04.00), ChipRevision(0x03), BiosVersion(07.39.00.00)
Sep 27 16:56:09 PlexHub kernel: mpt2sas_cm0: Protocol=(
Sep 27 16:56:09 PlexHub kernel: Initiator
Sep 27 16:56:09 PlexHub kernel: ,Target
Sep 27 16:56:09 PlexHub kernel: ), 
Sep 27 16:56:09 PlexHub kernel: Capabilities=(
Sep 27 16:56:09 PlexHub kernel: TLR
Sep 27 16:56:09 PlexHub kernel: ,EEDP
Sep 27 16:56:09 PlexHub kernel: ,Snapshot Buffer
Sep 27 16:56:09 PlexHub kernel: ,Diag Trace Buffer
Sep 27 16:56:09 PlexHub kernel: ,Task Set Full
Sep 27 16:56:09 PlexHub kernel: ,NCQ
Sep 27 16:56:09 PlexHub kernel: )
Sep 27 16:56:09 PlexHub kernel: mpt2sas_cm0: sending port enable !!
Sep 27 16:56:16 PlexHub kernel: mpt2sas_cm0: port enable: SUCCESS
Sep 27 16:56:16 PlexHub kernel: mpt2sas_cm0: search for end-devices: start

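-where mpt2sas is the driver for the LSI card. This is where the card s--t the bed and where all of the disk problems originated, probably due to the outdated firmware: the controller fault is logged at 16:56:08, about a minute before the disk1 write errors at 16:57:11.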
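If you'd rather not scroll through the whole syslog looking for this pattern, something like the Python sketch below might help. It assumes only the line formats shown in the excerpts above; the function name `summarize`, the report keys, and the `SAMPLE` list are made up for illustration, and it is a rough aid, not a replacement for reading the diagnostics.

```python
import re
from collections import Counter

# Patterns are based only on the log lines quoted in this thread.
FAULT = re.compile(r"kernel: (mpt2sas_cm\d+): fault_state\((0x[0-9a-fA-F]+)\)")
FWVER = re.compile(r"FWVersion\(([\d.]+)\)")
WRITE_ERR = re.compile(r"kernel: md: (disk\d+) write error, sector=(\d+)")


def summarize(lines):
    """Hypothetical helper: list controller faults, firmware versions seen,
    and count md write errors logged before/after the first fault."""
    faults = []          # (timestamp text, controller name, fault code)
    firmware = set()     # firmware versions the driver reported
    before = Counter()   # write errors per disk before any controller fault
    after = Counter()    # write errors per disk after a controller fault
    for line in lines:
        stamp = line[:15]  # classic syslog prefix, e.g. "Sep 27 16:56:08"
        match = FAULT.search(line)
        if match:
            faults.append((stamp, match.group(1), match.group(2)))
            continue
        match = FWVER.search(line)
        if match:
            firmware.add(match.group(1))
            continue
        match = WRITE_ERR.search(line)
        if match:
            bucket = after if faults else before
            bucket[match.group(1)] += 1
    return {
        "faults": faults,
        "firmware": sorted(firmware),
        "write_errors_before_fault": dict(before),
        "write_errors_after_fault": dict(after),
    }


# A few lines copied from the excerpts above, in the order they were logged.
SAMPLE = [
    "Sep 27 16:56:08 PlexHub kernel: mpt2sas_cm0: fault_state(0x7e23)!",
    "Sep 27 16:56:09 PlexHub kernel: mpt2sas_cm0: LSISAS2008: "
    "FWVersion(20.00.04.00), ChipRevision(0x03), BiosVersion(07.39.00.00)",
    "Sep 27 16:57:11 PlexHub kernel: md: disk1 write error, sector=13050290504",
    "Sep 27 16:57:11 PlexHub kernel: md: disk1 write error, sector=13050290512",
]

report = summarize(SAMPLE)

if __name__ == "__main__":
    for key, value in report.items():
        print(f"{key}: {value}")
```

To run it on a real log, pass the lines of the syslog from your diagnostics zip to `summarize` in place of `SAMPLE`. Write errors that only show up after a controller fault point at the card rather than the disk, though that is a hint, not proof.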

 

14 hours ago, Cazzy said:

Yes, it did appear as mountable.

And I asked because sometimes when a disk becomes disabled, the emulated disk appears unmountable. This can generally be fixed with a file system repair without losing much, if any, data. But in some extreme cases, where it cannot be repaired or the recovered data is a mess, the contents of the physical drive may be a better choice. Really, it all comes down to a case-by-case basis.

On 10/2/2020 at 10:22 AM, civic95man said:

-where mpt2sas is the driver for the LSI card. This is where the card s--t the bed and where all of the disk problems originated, probably due to the outdated firmware.

Looks like it was the firmware. After re-flashing, the drive was rebuilt. I checked the logs and those errors are gone. I ran a SMART test again, just to be safe, and it passed.

 

I think we're good now! (Fingers crossed)

 

Thank you for your insight and helping me out! @trurl as well! 

