Drive Reconstruction Stalled (drive icon grey triangle?)

jeffreywhunter · May 25, 2015

Unraid 6 RC3

I have a 1TB drive that was accidentally unplugged when a server was moved. It came up as failed (of course), so after some fiddling and a reboot or two I opened the case to see what the issue was. Unplugged. Easy, right?

So I plugged it back in and rebooted. The drive then responded, so I ran a SMART check on it and left it to do what it does.

This morning I checked to see what's going on and the drive is still active, but...the drive icon is a grey triangle. That state is not on the list!

Short and extended self-tests ran with no problems.

http://my.jetscreenshot.com/12412/20150525-vfqk-47kb.jpg

Screenshot of Main

http://my.jetscreenshot.com/12412/20150524-oigx-94kb.jpg

http://my.jetscreenshot.com/12412/20150525-8eru-61kb.jpg

Screenshot of unmenu - disk array status - disk_dsbl?

http://my.jetscreenshot.com/12412/20150524-4pae-150kb.jpg

I've run the SMART check a couple of times, and no errors were reported.

Smart report, disk attributes & syslog zip attached.

Thanks in advance guys, for your wisdom and experience!

Disk_Failure.zip

Squid · May 25, 2015

To get unRAID to rebuild the drive back onto itself, you've got to stop the array, change disk6 to "not installed", and restart the array. Then stop the array again and change disk6 back to what it's supposed to be. When you start the array, it will begin to rebuild the drive.

jeffreywhunter · May 25, 2015

Thanks Squid, "content being reconstructed"... Is this the same process you'd use to upgrade a disk? i.e. if I wanted to replace the 1TB with a 3TB or whatever?

jeffreywhunter · May 25, 2015

And might I also say, Squid, thanks for not embarrassing me by pointing out that I pulled the SMART report on the wrong drive! Duh... Need to get more sleep!

#youguysaregreat!

Squid · May 25, 2015

Thanks Squid, "content being reconstructed"... Is this the same process you'd use to upgrade a disk? i.e. if I wanted to replace the 1TB with a 3TB or whatever?

Yes. But your parity drive has to be as large as or larger than the largest data drive.
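As a rough illustration of that size rule, here is a minimal sketch that checks whether a planned upgrade would leave any data drive larger than parity. The function name, the slot indexing, and the use of GB as the unit are all assumptions made for the example; unRAID performs its own check in the web interface.

```python
def upgrade_is_allowed(parity_size_gb, data_sizes_gb, slot, new_size_gb):
    """Return True if replacing the data disk at index `slot` with a disk of
    `new_size_gb` keeps every data disk no larger than the parity disk.

    All names and units here are hypothetical, for illustration only.
    """
    sizes = list(data_sizes_gb)   # copy so the caller's list is not modified
    sizes[slot] = new_size_gb     # apply the planned replacement
    return max(sizes) <= parity_size_gb


# A 1TB data disk upgraded to 3TB is fine with 3TB parity,
# but not with 2TB parity.
print(upgrade_is_allowed(3000, [1000, 2000], 0, 3000))
print(upgrade_is_allowed(2000, [1000, 2000], 0, 3000))
```

If the check fails, the usual approach is to upgrade the parity drive first and then the data drive.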

And might I also say, Squid, thanks for not embarrassing me by identifying I pulled the Smart report on the wrong drive! Duh... Need to get more sleep!

#youguysaregreat!

Didn't even look at it

jeffreywhunter · May 25, 2015

The drive reconstruction ran for a few minutes then this displayed...

http://my.jetscreenshot.com/12412/20150525-qrq8-120kb.jpg

Disk ID shows this: http://my.jetscreenshot.com/12412/20150525-4zh5-36kb.jpg

But the cool thing is that my drive has been expanded to a 600PB drive!

http://my.jetscreenshot.com/12412/20150525-svby-48kb

Squid · May 25, 2015

And you're complaining???

Drive probably now has some issues.

Check out this: http://lime-technology.com/forum/index.php?topic=39827.0

sureguy · May 25, 2015

The drive reconstruction ran for a few minutes then this displayed...

http://my.jetscreenshot.com/12412/20150525-qrq8-120kb.jpg

Disk ID shows this: http://my.jetscreenshot.com/12412/20150525-4zh5-36kb.jpg

But the cool thing is that my drive has been expanded to a 600PB drive!

http://my.jetscreenshot.com/12412/20150525-svby-48kb

Im

.

sureguy · May 25, 2015

Peter... P.........poolanything

Main reason why we can't do

.....mi

..

UK. Up

U. I 7

jeffreywhunter · May 25, 2015

Sureguy, strange, only bits of your typing came through. Like a bad phone connection...??!?

jeffreywhunter · May 25, 2015

I've replaced disk 6 and started a rebuild with another precleared drive. It immediately stated it was rebuilding then reported a command timeout 3.

http://my.jetscreenshot.com/12412/20150525-fzkq-139kb.jpg

All of these drives came from an operational (and error-free) Windows-based system. All passed multiple preclears. Why so many errors under Linux??!? Is it more picky or something?

Squid · May 25, 2015

Because Windows never tells you about these things.

garycase · May 25, 2015

This is the 2nd 600 PB drive we've seen !!

I wonder if there's some esoteric bug that's causing this ... seems very unlikely the drives are actually misreporting this -- the two drives involved are different models altogether.

jeffreywhunter · May 25, 2015

Well... I think this must be a controller problem. I came back after I had swapped a drive, replaced the cables and started the rebuild. A while after the rebuild started, the very same port is reporting bad data... So perhaps the SATA adapter is bad? Syslog attached.

http://my.jetscreenshot.com/12412/20150525-ze7a-102kb.jpg

I guess this is what you get when you try to use up 'stuff' lying around. Probably better just to sell it on eBay and buy something new... sheesh...

syslog20150525-1.zip

garycase · May 25, 2015

Yep ... looks like a controller issue.

As for whether or not you should "... sell it on ebay ..." ==> let your conscience be your guide [I'd just trash it]

jeffreywhunter · May 25, 2015

Very true...but it works without issue under Windows...

garycase · May 25, 2015

Ahh ... so it's a Linux driver issue and not an actual hardware problem.

In that case it's time to write your eBay ad.

jeffreywhunter · May 26, 2015

So are Linux drivers just more particular? Or is a better way to say it, less tolerant of marginal hardware? I've had several drives that pass preclear several times go 'bad', but format just fine under Windows and don't have any chkdsk issues... Plus the side-track trying to get that ASUS P8Z68-V Pro running. I switched to the ASUS M5A97 EVO and it's been working for several days with no issues. Mind you, it's bare and just idling, but I couldn't get the P8Z68 to do that!

Next step is to plant the AOC HBA and see if it pukes...

JonathanM · May 26, 2015

So are Linux drivers just more particular? Or is a better way to say it, less tolerant of marginal hardware? I've had several drives that pass preclear several times go 'bad', but format just fine under Windows and don't have any chkdsk issues.

chkdsk is tolerant of errors that Linux reports. Most times the drive's own SMART reports are the best indicator of issues, in either Windows or Linux. Since Windows doesn't need 100% of the drive to be flawless, a marginal drive will function just fine in a Windows environment, but puke in an unRAID server. unRAID requires 100% accuracy over the entire disk surface* of every drive to properly recover 1 failed drive. *(subject to the actual sizes of the drives involved)

Many years ago I made the mistake of putting some marginal drives in an unRAID server thinking, "hey, if one fails, no big deal, I'll just rebuild it." That logic failed big time when I had a second failure while trying to rebuild a drive. Now I no longer tolerate any marginal drives in my array. First sign of trouble, it's out of there. I've seen too many customers' drives fail with very little or no warning, so I'm very conservative on my own servers.
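To see why a rebuild depends on every other drive reading back correctly, here is a minimal sketch of single-parity reconstruction. It assumes parity is a plain byte-wise XOR across the data drives and uses tiny 4-byte "disks"; the function and variable names are made up for the example and this is not unRAID's actual code.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving_data, parity):
    """Recover the one missing disk from the surviving disks plus parity."""
    return xor_blocks(list(surviving_data) + [parity])


# Three tiny hypothetical data disks and their parity.
disk1 = b"\x01\x02\x03\x04"
disk2 = b"\x10\x20\x30\x40"
disk3 = b"\xaa\xbb\xcc\xdd"
parity = xor_blocks([disk1, disk2, disk3])

# disk2 fails: with clean reads from disk1 and disk3 it comes back intact.
assert rebuild([disk1, disk3], parity) == disk2

# Now suppose marginal disk3 returns one wrong byte during the rebuild.
bad_disk3 = b"\xaa\xbb\xcc\xde"
rebuilt = rebuild([disk1, bad_disk3], parity)

# The error lands in the rebuilt disk2 at the same offset.
assert rebuilt != disk2
assert rebuilt[:3] == disk2[:3]
```

With only one parity block per stripe there is nothing left over to detect or correct that bad byte, and if a second drive drops out entirely mid-rebuild the missing data cannot be computed at all.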

trurl · May 26, 2015

So are Linux drivers just more particular? Or is a better way to say it, less tolerant of marginal hardware? I've had several drives that pass preclear several times go 'bad', but format just fine under Windows and don't have any chkdsk issues... Plus the side-track trying to get that ASUS P8Z68-V Pro running. I switched to the ASUS M5A97 EVO and it's been working for several days with no issues. Mind you, it's bare and just idling, but I couldn't get the P8Z68 to do that!

Next step is to plant the AOC HBA and see if it pukes...

Preclear (or clear) uses every part of a drive. Building or rebuilding parity or a data disk uses every part of every drive. I don't think Windows knows if a disk has problems until it tries to read or write data to a part that has problems, which may not ever happen. Chkdsk is only testing for file system problems unless you do a surface scan.
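The difference between "only touches what it needs" and "touches everything" can be sketched as a loop that reads a file or device from the first block to the last and records where reads fail. This is only an illustration of a full read pass, not what preclear or a parity rebuild actually runs; the function name and default block size are assumptions, and on a real system reads may be served from cache rather than from the disk.

```python
import os


def read_every_block(path, block_size=1024 * 1024):
    """Read `path` start to end; return (bytes_read, offsets_that_failed)."""
    bytes_read = 0
    bad_offsets = []
    with open(path, "rb") as f:
        size = f.seek(0, os.SEEK_END)   # total size in bytes
        offset = 0
        while offset < size:
            f.seek(offset)
            try:
                chunk = f.read(min(block_size, size - offset))
                bytes_read += len(chunk)
            except OSError:
                # A read error here is the kind of problem that stays
                # hidden until something actually touches this block.
                bad_offsets.append(offset)
            offset += block_size
    return bytes_read, bad_offsets
```

A filesystem check that never visits a bad block will never trip over it, while a pass like this (or a rebuild) visits all of them.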
