[Solved] Red X — Disk Disabled, Content Emulated — After Sudden Power Outage/Cache Install



as the title suggests, the unraid server had a sudden power cut when the house's electricity went down. the server was running on its last trial extension, and a key was acquired to start the array again. before doing so, an ssd cache drive was installed and, upon powering on, added to the array. the key was installed, and after starting the array all disks were green/active. but since unraid checks all drives after a sudden shutdown, the check had reached around 24% when disk 1 got disabled with its content emulated. the disk refuses to run either a short or an extended SMART test, reporting "Interrupted (host reset)". the check finished with 678 errors on disk 1 and 0 errors on the other 4 disks.

 

this is my first encounter with a disabled disk. need help identifying the issue, and also what would be the next step? attached below are the diagnostic files and the last smart report, with screenshots of the error notification and the smart report for the disabled disk.

(note: the udma crc error count notification appears on all the other disks as well. it has existed since setting up unraid a few months back, while operations were running smoothly.)

 

cheers

 

2080246888_ScreenShot2019-06-18at9_10_41PM.png.7e65f96a525574898ca5da3ad0752938.png

 

1687035685_ScreenShot2019-06-19at5_58_41AM.thumb.png.4fdb8f6d186a30b436aad5d545833301.png

tower-diagnostics-20190619-0302.zip tower-smart-20190619-0602.zip

Edited by iilied

Your syslog shows lots of connection problems with disk1. These are normally (although not always) caused by bad cabling (SATA or power) to the drive. You should power down, carefully check the cabling to that drive, and make sure it is properly seated. The SMART report looks OK, so hopefully it is just the cabling to the drive.

 

CRC errors are caused by connection problems. Once they have happened, the count never gets reset to 0. If you click on the disk’s icon on the Dashboard tab there is an option to Acknowledge the current count, and Unraid will then only tell you again if it increases. Occasional CRC errors are not a problem, but if they are occurring regularly then you have an issue that needs sorting (typically cable related).
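As a rough illustration of the "only tell you again if it increases" idea, here is a minimal Python sketch that pulls SMART attribute 199 (UDMA_CRC_Error_Count) out of the text produced by `smartctl -A` and compares it with a previously acknowledged count. The function names and the sample text are made up for illustration (the sample is not taken from the posted diagnostics), and this is not how Unraid itself implements the check.

```python
from typing import Optional

# Illustrative sample in the usual `smartctl -A` table layout (made-up values).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       3
"""

def udma_crc_count(smartctl_output: str) -> Optional[int]:
    """Return the raw value of attribute 199, or None if it is not listed."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have 10 columns; the raw value is the last one.
        if len(fields) >= 10 and fields[0] == "199":
            return int(fields[9])
    return None

def should_alert(current: int, acknowledged: int) -> bool:
    """Alert only when the count has grown past the acknowledged baseline."""
    return current > acknowledged

count = udma_crc_count(SAMPLE)
print(count, should_alert(count, acknowledged=1))
```

The raw count itself is cumulative, so the useful signal is whether it keeps growing after the cabling has been changed, not its absolute value.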

 

The disk being disabled means a write to it has failed. When that happens Unraid will mark it as disabled and stop writing to it, instead emulating its contents from the combination of the other data disks plus the parity drive, so you can still access its contents as if it were present. At this point you are no longer protected against another drive having problems, so you want to clear the disabled state. Hopefully the write failure is due to the connection issues and the drive itself is OK. Because a write failed, the drive no longer has up-to-date data. To clear a disabled state you have to rebuild the drive from the emulated contents.

Ideally you would do this to a spare/replacement drive and then check the drive that got disabled offline. However, if you do not have a spare/replacement drive, and you think the one that got disabled is OK and the issue was caused by cabling (and you have corrected this), then you can rebuild to the same drive using the following steps:

  • Stop the array.
  • Unassign the disabled drive.
  • Start the array. Unraid will warn you that the contents of the missing drive are being emulated. This step makes Unraid ‘forget’ the current assignment.
  • Stop the array.
  • Reassign the drive.
  • Start the array, and Unraid will start rebuilding the drive’s contents from the emulated content. You can access the drive normally during this process, as Unraid will use the emulated contents while rebuilding.

As long as the rebuild finishes without error you are good.    If it fails (or any other issue arises) check back for advice and provide new diagnostics.
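To make the emulation and rebuild a bit more concrete, here is a toy Python sketch of single-parity reconstruction. It assumes parity is the bytewise XOR of all data disks; the three-byte "disks" and the helper name `xor_blocks` are hypothetical, and the real thing works on whole drives rather than tiny byte strings.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Toy contents of three data disks (made-up values).
disk1 = b"\x10\x20\x30"
disk2 = b"\x0f\x0e\x0d"
disk3 = b"\xaa\xbb\xcc"

# Parity is computed across all data disks.
parity = xor_blocks([disk1, disk2, disk3])

# With disk1 disabled, its contents can be emulated from the rest plus parity.
emulated_disk1 = xor_blocks([disk2, disk3, parity])
print(emulated_disk1 == disk1)
```

This is also why a second failing drive is a problem while one is disabled: with single parity, the XOR can only be solved for one missing block at a time.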

 

BTW: The diagnostics include the SMART reports for all drives, so there is no need to supply them separately.


shuffled/reseated the sata and power cables, then rebuilt onto the same drive after it passed an extended smart test, and all went well.

 

1706773090_ScreenShot2019-06-20at7_06_18AM.png.1fa6bf42697f8a36bc0de105e3020c53.png

 

On 6/19/2019 at 7:18 AM, itimpi said:

CRC errors are caused by connection problems. Once they have happened, the count never gets reset to 0. If you click on the disk’s icon on the Dashboard tab there is an option to Acknowledge the current count, and Unraid will then only tell you again if it increases. Occasional CRC errors are not a problem, but if they are occurring regularly then you have an issue that needs sorting (typically cable related).

 

yet this issue still persists. tried new sata cables, et al., and still received a new crc count, which jumped from 1 to 3 on disk1. currently unsure about the culprit, and would appreciate any help on how to go about fixing it.

 

cheers
