Disks 1 and 6 disabled, contents emulated, after reboot


The disk used and free numbers weren't updating in the unRAID webUI, so I rebooted. After the reboot the used and free numbers were updated, but Disks 1 and 6 had become disabled, contents emulated. I ran a short SMART test on each and both passed. What is the next step? A long SMART test? The last three diagnostics (two from today after two reboots and one from 9 days ago) are attached.

server2018-diagnostics-20240112-1230.zip server2018-diagnostics-20240121-1630.zip server2018-diagnostics-20240121-1656.zip


Hardware details which might not appear in Diagnostics

 

 

I checked the SATA connections to the iStarUSA BPN-DE350SS 3 X 5.25-Inch to 5 X 3.5-Inch SAS/SATA Trayless Hot-Swap Cage (Black), and all five are secure and not loose at all. Those five are among the eight SATA connectors provided by the breakout cabling (CableCreation Internal HD Mini SAS (SFF-8643 Host) - 4X SATA (Target) Cable, 1.6ft) from an LSI 9300-8i PCI-Express 3.0 SATA/SAS 8-Port SAS3 12Gb/s HBA (Avago Technologies). The HBA is installed in the motherboard listed in the Diagnostics, which is running the latest BIOS.


Those first diagnostics were shortly after reboot, without the array started, so nothing was mounted, so no chance for anything to get disabled yet.

 

The next diagnostics were shortly after reboot so don't tell us what happened to disable the disks. But the array is started, and all disks mount including the emulated/disabled disks.

 

The last diagnostics were shortly after reboot, and without the array started, so they don't even tell us if the disks are mountable or not.

 

SMART for the disabled disks looks OK, but neither has completed a recent extended self-test. Couldn't hurt to do that. The 16TB will take a very long time; the 10TB not quite as long, but still long. You can do both at the same time.
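
For anyone who prefers the command line to the webUI, here is a minimal sketch of the `smartctl` calls involved. It only builds and prints the commands, it does not run them. The device names `/dev/sdg` and `/dev/sdj` are an assumption taken from the SMART report file names attached later in this thread (disk6 = sdg, disk1 = sdj); device letters can change between boots, so check them first.

```python
import shlex


def smart_extended_test_commands(device):
    """Return smartctl invocations to start and later inspect an extended self-test."""
    return [
        ["smartctl", "-t", "long", device],      # start the extended (long) self-test
        ["smartctl", "-l", "selftest", device],  # list the drive's self-test log entries
        ["smartctl", "-H", device],              # overall health assessment
    ]


# Assumed device names (from the attached report file names); verify before use.
DEVICES = ["/dev/sdg", "/dev/sdj"]


def render(devices):
    """Flatten the per-device commands into printable shell lines."""
    return [
        shlex.join(cmd)
        for dev in devices
        for cmd in smart_extended_test_commands(dev)
    ]


if __name__ == "__main__":
    for line in render(DEVICES):
        print(line)
```

The self-test runs inside each drive's own firmware, which is why both tests can be started back to back and left to run in parallel.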

 

While technically you can continue to use your server, you currently have no redundancy.

 

You don't mention power to the disks. Did you also check that? Any splitters involved?

 

 

1 hour ago, trurl said:

You don't mention power to the disks. Did you also check that? Any splitters involved?

 

 

 

Yes, there is a molex to 2x SATA power splitter. The 5-bay drive cage has two SATA power inputs, so it uses both outputs from the splitter. Between those there was also a molex-to-molex with a smaller cable powering a fan. Stupidly, I examined this setup just now while the unRAID server was running, and the other 3 drives in the drive cage got read errors, so I shut down. Now that it's shut down, I've examined the power connections and removed that molex-to-molex from the SATA power chain.

 

IMG_7454-COLLAGE.jpg

20 hours ago, trurl said:

SMART for the disabled disks looks OK, but neither has completed a recent extended self-test. Couldn't hurt to do that. The 16TB will take a very long time; the 10TB not quite as long, but still long. You can do both at the same time.

The 10TB extended test completed without errors. The 16TB test is in progress, 80% complete.

 

Can rebuilding of the 10TB start while the 16TB test is running? If I wait until the 16TB self-test is done, is rebuilding both at once possible/advisable, or is one at a time the preferred method?

ST10000VN0004-1ZD101_ZA28Y9NQ-2021-04-07 disk6 (sdg) - DISK_DSBL.txt ST16000NE000-2RW103_ZL2P6V5T-2021-04-07 disk1 (sdj) - DISK_DSBL.txt


Wait until other disk completes self-test.

 

You could rebuild both at once, but I'm concerned that we haven't gotten to the bottom of why these disks became disabled in the first place.

 

I guess rebuild will "test" whether or not rebuild will have problems.

 

After self-test completes, post new diagnostics so we can check if there are any indications of problems.

 

 

 

 


That all looks OK.

 

I assume you want to rebuild onto the same disks since there is nothing wrong with them. And if the connection problem or whatever happens again, we can try to fix that and start rebuild over.

 

Doing both at once won't be much different than rebuilding one at a time, except faster of course. All disks except the disabled disks will be read to get the emulated data to write to the rebuilding disk(s).
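
To illustrate why every other disk has to be read, here is a toy sketch of single-parity (XOR) reconstruction. The disk contents are made-up 4-byte blocks and the function name is hypothetical. It covers only one missing disk; with two disks disabled at once, the second parity is also needed, and that calculation is different and not shown here.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Toy data disks (made-up contents, 4 bytes each).
disk_a = bytes([0x10, 0x20, 0x30, 0x40])
disk_b = bytes([0x0F, 0xF0, 0xAA, 0x55])
disk_c = bytes([0x01, 0x02, 0x03, 0x04])

# Parity is the XOR of all data disks.
parity = xor_blocks([disk_a, disk_b, disk_c])

# Pretend disk_b is disabled: its contents are recovered ("emulated")
# by reading every remaining disk plus parity and XOR-ing them.
rebuilt_b = xor_blocks([disk_a, disk_c, parity])

if __name__ == "__main__":
    print(rebuilt_b == disk_b)
```

Since each position on the rebuilding disk depends on the same position of every other disk, the healthy disks get read end to end regardless of how many disks are being rebuilt.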

 

https://docs.unraid.net/unraid-os/manual/storage-management/#rebuilding-a-drive-onto-itself

 

I don't expect any serious problem, but I always ask this. Do you have another copy of anything important and irreplaceable?

 

 

Event: Unraid Data-Rebuild
Subject: Notice [SERVER2018] - Data-Rebuild finished (0 errors)
Description: Duration: 1 day, 4 hours, 10 minutes, 16 seconds. Average speed: 157.8 MB/s

 

It finally finished. I wasn't sure if the smaller drive would finish and re-enable first but they both remained disabled until the 16TB finished also. Diagnostic attached.
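
As a rough sanity check on the notification above, duration times average speed should come out near the size of the largest disk. This is a back-of-the-envelope sketch, assuming "MB/s" in the notification means 10^6 bytes per second.

```python
# Figures copied from the Data-Rebuild notification.
days, hours, minutes, seconds = 1, 4, 10, 16
avg_speed_mb_s = 157.8  # assumed to mean 10**6 bytes per second

# Convert the duration to seconds.
duration_s = ((days * 24 + hours) * 60 + minutes) * 60 + seconds

# Total data processed, in decimal terabytes.
total_tb = duration_s * avg_speed_mb_s / 1_000_000

if __name__ == "__main__":
    print(duration_s, round(total_tb, 2))
```

The result is about 16 TB, which is consistent with a single pass over the full 16TB and would fit with both disks staying disabled until the larger one finished.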

 

On 1/22/2024 at 8:34 PM, trurl said:

Do you have another copy of anything important and irreplaceable?

 

 

I was Resilio-syncing a big folder of unRAID stuff to a 10TB external USB HDD whose partition vanished, but I haven't gotten around to seeing if that's recoverable yet.

 

 

server2018-diagnostics-20240124-0057.zip


That all looks good.

 

7 hours ago, oh-tomo said:

I was Resilio-syncing a big folder of unRAID stuff to a 10TB external USB HDD whose partition vanished

 

You mean you don't know if you have another copy of anything important and irreplaceable? Parity is not a substitute for backup.

 

You must always have another copy of anything important and irreplaceable. You get to decide what qualifies.
