Failed Controller, Failed Disk, New USB enclosure



OK gurus, here is my dilemma. I have had disks failing left and right. I replaced disks, replaced SATA cables, and now the system will not boot because it hangs on the controller. Before that, I had replaced the failed drive and started a rebuild, only for it to hang on constant read errors from disk 3. Disk 2 is the failed drive.

 

Anyway, I decided I needed a new controller, and since I still need a better enclosure, I went with a USB 3.0 eight-disk enclosure. The problem is that I need to get the system to recognize the new configuration with the disks coming off USB. I know I can use the "New Config" tool, but from what I've read, once that's done I will not be able to rebuild the failed disk 2. Is this so?

 

I read somewhere that I can keep the assignments and also flag the parity disk as having good parity. If so, why would I not be able to rebuild disk 2?

 

Are there any workarounds for this, or am I just going to have to face the fact that disk 2 and its data are lost?

7 minutes ago, dmangus said:

so I went with a USB 3.0 eight-disk enclosure

USB, while it may work, is extremely problematic for array drives. It's just not a robust enough protocol for the demanding environment of multi-disk RAID or Unraid.

 

Did you originally have all your drives directly connected to SATA ports?

 

Describe your hardware in much more detail, and maybe we can find a solution that will work.


Yeah, I have a Dell OptiPlex 9020 as my server, and I can't remember the model of the add-on SCSI controller. I believe it's a Marvell chipset though, 4-port. The server has been running for a good 5 years with 5 drives: parity, 3x data, and cache. The last 3 months have been hell with failed drives. I basically had to restore files once, as 2 drives failed at one point, all with high CRC error counts. It started to look like more than a coincidence that the same drive numbers kept failing. It wasn't until the card itself went kaput that I knew for sure it was the PCIe SCSI controller.

 

I have a 4-bay disk enclosure, for which I have to run both the power and the SATA cables from the workstation out the back. Since I have to go with something different anyway, I might as well go with an enclosure that has its own power and 8 bays, so I can go double parity. Going through the whole process of restoring files from chunks discovered on the disk was hellish, and I don't want to go through that again.

 

I figured I would give USB 3.0 a try, as otherwise I would have to find a good HBA/controller, go through flashing it, etc. I just wanted to keep it simple, I guess. Plus I would still need to find an enclosure with its own power that can support 5+ drives.

 

Nonetheless, it wouldn't change where I am now. I still need to create a new configuration and somehow rebuild disk 2 from the current parity, if I can.

 

 

Just now, johnnie.black said:

You can do that with the invalid slot command, but need more details, like what Unraid release you are running and if using single or dual parity.

Hey Ti-Ti, I'm running 6.6.7 with a single parity drive. I know exactly which drive is parity and which are data drives 1, 2, and 3. Drive 2 is a brand-new drive that needs to be rebuilt. Any help would be greatly appreciated.

😇


Assuming parity is valid you can do this:

 

-Tools -> New Config
-Assign all disks, including new disk2
-Important - After checking the assignments leave the browser on that page, the "Main" page.

-Open an SSH session/use the console and type (don't copy/paste directly from the forum, as sometimes it can insert extra characters):

mdcmd set invalidslot 2 29

-Back on the GUI, and without refreshing the page, just start the array. Do not check the "parity is already valid" box. The GUI will still show that data on the parity disk(s) will be overwritten; this is normal, as it doesn't account for the invalid slot command, but parity won't be overwritten as long as the procedure was done correctly. Disk2 will start rebuilding. The disk should mount immediately, but if it's unmountable don't format it; wait for the rebuild to finish and then run a filesystem check.
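
On the copy/paste warning above: one way to check a pasted command for hidden characters is to test it for bytes outside printable ASCII. This is only a sketch, and the helper name `has_hidden_chars` is made up for illustration; it is not part of Unraid or `mdcmd`.

```shell
#!/bin/bash
# Hypothetical helper (not an Unraid tool): succeeds if the given string
# contains any byte outside printable ASCII (space through tilde).
has_hidden_chars() {
    printf '%s' "$1" | LC_ALL=C grep -q '[^ -~]'
}

# The command exactly as it would be typed by hand.
cmd='mdcmd set invalidslot 2 29'

if has_hidden_chars "$cmd"; then
    echo "hidden characters found, retype the command by hand"
else
    echo "command looks clean"
fi
```

A non-breaking space pasted in place of a normal space is the kind of thing this would flag. Typing the command by hand, as suggested, avoids the problem entirely.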

Hmm, sounds like a plan. I will let you know how it turns out. Thanks for the direction.


Sigh... well, I did the New Config and started to assign drives. When I assigned disk 3 (USB device 2), it vanished from the list. I was like, WTF! OK, so I thought maybe it had spun down or something. I rebooted, and now no USB disks are found, only my cache drive, which is on the onboard SATA controller. tower-syslog-20190831-0029.zip

 

Scratching my head here. The device I have is a Mediasonic-H82-SU3S2-ProBox.


USB enclosures are not recommended for Unraid. One of the many reasons is that they have a tendency to drop devices, though yours is showing a constant problem. Disks are detected correctly, e.g.:

Aug 30 23:55:10 Tower kernel: scsi 7:0:0:0: Direct-Access     ST4000DM 004-2CV104       0125 PQ: 0 ANSI: 6
Aug 30 23:55:10 Tower kernel: sd 7:0:0:0: Attached scsi generic sg2 type 0
Aug 30 23:55:10 Tower kernel: sd 7:0:0:0: [sdc] Very big device. Trying to use READ CAPACITY(16).
Aug 30 23:55:10 Tower kernel: sd 7:0:0:0: [sdc] 7814037168 512-byte logical blocks: (4.00 TB/3.64 TiB)
Aug 30 23:55:10 Tower kernel: sd 7:0:0:0: [sdc] 4096-byte physical blocks
Aug 30 23:55:10 Tower kernel: sd 7:0:0:0: [sdc] Write Protect is off
Aug 30 23:55:10 Tower kernel: sd 7:0:0:0: [sdc] Mode Sense: 67 00 10 08
Aug 30 23:55:10 Tower kernel: sd 7:0:0:0: [sdc] No Caching mode page found
Aug 30 23:55:10 Tower kernel: sd 7:0:0:0: [sdc] Assuming drive cache: write through

Then there's this error:

 

Aug 30 23:55:10 Tower kernel: usb 4-6: Device not responding to setup address.
Aug 30 23:55:10 Tower kernel: usb 4-6: Device not responding to setup address.
Aug 30 23:55:11 Tower kernel: usb 4-6: device not accepting address 4, error -71
Aug 30 23:55:15 Tower kernel: usb usb4-port6: Cannot enable. Maybe the USB cable is bad?
Aug 30 23:55:19 Tower kernel: usb usb4-port6: Cannot enable. Maybe the USB cable is bad?
Aug 30 23:55:23 Tower kernel: usb usb4-port6: Cannot enable. Maybe the USB cable is bad?
Aug 30 23:55:23 Tower kernel: usb 4-6: USB disconnect, device number 4

 

And all disks drop out and capacity can't be read:

 

Aug 30 23:55:23 Tower kernel: sd 7:0:0:0: [sdc] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x01 driverbyte=0x00
Aug 30 23:55:23 Tower kernel: sd 7:0:0:0: [sdc] tag#0 CDB: opcode=0x88 88 00 00 00 00 00 00 00 00 20 00 00 00 08 00 00
Aug 30 23:55:23 Tower kernel: print_req_error: I/O error, dev sdc, sector 32
Aug 30 23:55:23 Tower kernel: Buffer I/O error on dev sdc, logical block 4, async page read
Aug 30 23:55:23 Tower kernel: sd 7:0:0:0: [sdc] Read Capacity(16) failed: Result: hostbyte=0x01 driverbyte=0x00
Aug 30 23:55:23 Tower kernel: sd 7:0:0:0: [sdc] Sense not available.
Aug 30 23:55:23 Tower kernel: sd 7:0:0:0: [sdc] Read Capacity(10) failed: Result: hostbyte=0x01 driverbyte=0x00
Aug 30 23:55:23 Tower kernel: sd 7:0:0:0: [sdc] Sense not available.
Aug 30 23:55:23 Tower kernel: sd 7:0:0:0: [sdc] 0 512-byte logical blocks: (0 B/0 B)
Aug 30 23:55:23 Tower kernel: sd 7:0:0:0: [sdc] 4096-byte physical blocks
Aug 30 23:55:23 Tower kernel: sd 7:0:0:0: [sdc] Attached SCSI disk

Then the process starts again and keeps repeating, for all disks.
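
To see how often this loop repeats, the disconnect events can be counted per USB device path in the syslog. The following is a rough sketch, assuming kernel lines laid out like the ones quoted above (month, day, time, host, `kernel:`, `usb`, device path); the function name `count_usb_disconnects` is made up for illustration.

```shell
#!/bin/bash
# Sketch: count "USB disconnect" events per USB device path.
# Assumes the syslog layout shown in this thread, where the device
# path (e.g. "4-6:") is the 7th whitespace-separated field.
# Reads the files given as arguments, or stdin if none are given.
count_usb_disconnects() {
    awk '/kernel: usb [0-9.-]+: USB disconnect/ {
             port = $7
             sub(/:$/, "", port)   # strip the trailing colon
             n[port]++
         }
         END { for (p in n) print p, n[p] }' "$@" | sort
}

# Demo on two lines taken from the log above; prints "4-6 1".
count_usb_disconnects <<'EOF'
Aug 30 23:55:23 Tower kernel: usb usb4-port6: Cannot enable. Maybe the USB cable is bad?
Aug 30 23:55:23 Tower kernel: usb 4-6: USB disconnect, device number 4
EOF
```

Pointed at the full syslog from the diagnostics zip, a count that keeps climbing for the same device path would match the repeating pattern described here.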

1 hour ago, dmangus said:

Great, story of my life, thanks. I guess I have no choice but to find a SATA controller and a decent enclosure.

 

My current USB enclosure also supports eSATA, how about this?

It is definitely worth trying eSATA as that is more likely to function reliably from a connection viewpoint than USB.

On 9/2/2019 at 3:49 PM, Kevek79 said:

Didn't you start this whole adventure because of a non-working Marvell controller?

So why replace it with another one of those?

Yeah, right. Well, because it didn't give problems until it started to die completely, so I see that as a manufacturing issue, not a compatibility issue. Anyway, that controller didn't come through; Amazon ran out of it. I picked up an I/O Crest 2 Port SATA III and 2 Port eSATA card. Yes, another Marvell chipset.

 

New issue, though. All the drives were picked up this time except one. I am getting the following error for this one drive:

Sep  9 11:50:06 Tower kernel: ata7.02: READ LOG DMA EXT failed, trying PIO
Sep  9 11:50:06 Tower kernel: ata7.02: failed to read log page 10h (errno=-5)
Sep  9 11:50:06 Tower kernel: ata7.02: exception Emask 0x1 SAct 0x800 SErr 0x0 action 0x0
Sep  9 11:50:06 Tower kernel: ata7.02: configured for UDMA/133 (device error ignored)

As soon as I assign the drive to disk 3, it vanishes. Any ideas?

tower-syslog-20190909-1151.zip
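
For anyone digging through a syslog like this one, a quick way to see which ATA links are reporting trouble is to pull out the `ataN` / `ataN.NN` identifiers from lines that mention a failure, exception, or error. This is only a sketch under the assumption that the kernel lines look like the ones quoted above; the function name `ata_error_links` is made up for illustration.

```shell
#!/bin/bash
# Sketch: list the ATA links (e.g. "ata7.02") that appear on kernel log
# lines mentioning "failed", "exception", or "error".
# Reads the files given as arguments, or stdin if none are given.
ata_error_links() {
    grep -E 'kernel: ata[0-9]+(\.[0-9]+)?: .*(failed|exception|error)' "$@" \
        | sed -E 's/.*kernel: (ata[0-9]+(\.[0-9]+)?): .*/\1/' \
        | sort -u
}

# Demo on the four lines quoted above; prints "ata7.02".
ata_error_links <<'EOF'
Sep  9 11:50:06 Tower kernel: ata7.02: READ LOG DMA EXT failed, trying PIO
Sep  9 11:50:06 Tower kernel: ata7.02: failed to read log page 10h (errno=-5)
Sep  9 11:50:06 Tower kernel: ata7.02: exception Emask 0x1 SAct 0x800 SErr 0x0 action 0x0
Sep  9 11:50:06 Tower kernel: ata7.02: configured for UDMA/133 (device error ignored)
EOF
```

If only one link shows up, as here, that points at a single bay, cable, or drive rather than the whole enclosure.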
