Drive Red Balled, Concerned I am about to lose data



Disk 12 failed during a huge data transfer. The web GUI showed it red-balled.

 

I attempted a swap-disable (5TB in as the new parity, the old 2TB parity as the replacement), but nothing happened when it rebooted. The array started on its own and still showed the 2TB as parity and the 5TB as unassigned.

 

Not wanting to risk losing data, I shut down and swapped the disks back around (2TB back to parity and the "failed" drive back in as disk 12), reseated the data cable on disk 12 (connected to the AOC card), and restarted. One thing I noticed was that the AOC-SASLP-MV8 didn't show its POST on the screen! unRaid booted up and all the drives are visible, but now the disk 12 slot is "not installed" and the suspect 2TB is "unassigned" (see pic).

 

What is my next step? I am pretty sure "new config" will wipe out my data? I have attached two syslogs: one from the disk failure, the other from the reboot during the swap-disable attempt.

drive_failure_syslog-20141006-221302.zip

syslog.zip

Tower-Main_2014-10-06_23-02-54.png

Link to comment

I am pretty sure it was a loose data cable. Disk 12 is in the slot next to the one I added the cache drive to a couple of days ago. I may have jostled the cable while connecting the cache disk (all 5in3 cages; bottom cage, rearmost connector). When I pressed on it, I felt a click (locking connectors).

 

Need some advice here. I powered down last night and have done nothing since I posted above.

Link to comment

You probably have two routes:

 

1. Rebuild disk12 from the emulated disk12, either onto the same disk or onto another drive.

2. Start with a new config and use the data from the physical drive 12.

 

If you are confident in the data on the physical disk12, I would verify the contents with the array offline and use option 2.  If you put all disks back into their original slots and bring the array up with parity marked valid, the parity check should clean up any invalid parity bits.
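
To make the difference between the two routes concrete, here is a minimal Python sketch. It assumes single XOR parity across the data disks, which is my understanding of how unRAID parity works; the 4-byte "disks" and the `xor_blocks` helper are made up for illustration and are not anything unRAID actually runs.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical 4-byte "disks"; real disks are just much longer.
disk1 = bytes([0x12, 0x34, 0x56, 0x78])
disk2 = bytes([0xAB, 0xCD, 0xEF, 0x01])
disk12 = bytes([0xDE, 0xAD, 0xBE, 0xEF])

# Parity is the XOR of every data disk.
parity = xor_blocks([disk1, disk2, disk12])

# Route 1: disk12 is missing, so its contents are emulated (and later
# rebuilt) from parity plus all the surviving data disks.
emulated_disk12 = xor_blocks([parity, disk1, disk2])

# Route 2: trust the physical disks. If parity is stale in places,
# a correcting check recomputes it from the data disks.
stale_parity = bytes([0x00]) + parity[1:]
corrected_parity = xor_blocks([disk1, disk2, disk12])

print(emulated_disk12 == disk12)
print(corrected_parity == parity)
```

The point of the sketch: route 1 is only as good as parity and every other disk, while route 2 is only as good as what is physically on disk12, which is why checking its contents first matters.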

Link to comment

The rebuild finished without error, but the drive is slow and unresponsive. I am trying a reboot right now, but it seems to be stuck trying to read from the drive. I have a login prompt at the console now, but Tower is not showing on the network and the web GUI is not loading either. I am going to shut down, pull the disk, and then reboot to see if I can at least get the syslogs.

 

The drive light is on solid and I am getting a timeout error at the console.

syslog-20141007-211515.zip

syslog-20141007-213051.zip

Link to comment

If you have a spare drive, it would probably be good to rule out a disk issue.

 

It is strange; I don't see any errors in your logs.  Were you only seeing the slowness from that one drive, or on all data access?

 

When it's trying to access that drive (during boot it pauses for a long time, from the web GUI it hangs, etc.), the whole machine freezes up and becomes unresponsive. The 5in3 cage drive light is on solid when it happens. I was able to shut down from the command line.

 

Will get a disk from Best Buy today and give that a try.

Link to comment
