[Downgraded to Beta11] Can't Access GUI - Can Ping and Telnet Just fine...

SuperW2 · September 30, 2011

Was replacing a failed disk today (had a red-ball)...installed new 1.5TB - started a rebuild this AM before work, came home from work today to check on it, hoping it was done (started at like 800+ minutes), and unable to access my GUI.

I always access by IP and not DNS name, tried all 3 browsers, I can ping the server with normal response and telnet normally...

I pulled the Syslog.txt and attached, but honestly don't really know what I'm looking for.

EDIT The new disk is SDS - Disk17 if that makes any difference!

EDIT2: 5beta12

SuperW2 · September 30, 2011

I see this stuff, but no idea what disk ATA11 is, but it seems to be having some problems......

Sep 29 19:29:19 Media kernel: ata11: SATA link up 3.0 Gbps (SStatus 123 SControl 300)

Sep 29 19:29:19 Media kernel: ata11.00: configured for UDMA/133

Sep 29 19:29:19 Media kernel: scsi 12:0:0:0: Direct-Access ATA ST3750640AS 3.AA PQ: 0 ANSI: 5

Sep 29 19:29:19 Media kernel: scsi 13:0:0:0: Direct-Access ATA ST31000528AS CC36 PQ: 0 ANSI: 5

Sep 29 19:29:19 Media kernel: sd 12:0:0:0: [sdt] 1465149168 512-byte logical blocks: (750 GB/698 GiB)

Sep 29 19:29:19 Media kernel: sd 13:0:0:0: [sdu] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)

Sep 29 19:29:19 Media kernel: sd 13:0:0:0: [sdu] Write Protect is off

Sep 29 19:29:19 Media kernel: sd 13:0:0:0: [sdu] Mode Sense: 00 3a 00 00

Sep 29 19:29:19 Media kernel: sd 12:0:0:0: [sdt] Write Protect is off

Sep 29 19:29:19 Media kernel: sd 12:0:0:0: [sdt] Mode Sense: 00 3a 00 00

Sep 29 19:29:19 Media kernel: sd 13:0:0:0: [sdu] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

Sep 29 19:29:19 Media kernel: sd 12:0:0:0: [sdt] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

Sep 29 19:29:19 Media kernel: sdu: sdu1

Sep 29 19:29:19 Media kernel: sd 13:0:0:0: [sdu] Attached SCSI disk

Sep 29 19:29:19 Media kernel: ata11.00: exception Emask 0x10 SAct 0x1 SErr 0x780100 action 0x6

Sep 29 19:29:19 Media kernel: ata11.00: irq_stat 0x08000000

Sep 29 19:29:19 Media kernel: ata11: SError: { UnrecovData 10B8B Dispar BadCRC Handshk }

Sep 29 19:29:19 Media kernel: ata11.00: failed command: READ FPDMA QUEUED

Sep 29 19:29:19 Media kernel: ata11.00: cmd 60/08:00:00:00:00/00:00:00:00:00/40 tag 0 ncq 4096 in

Sep 29 19:29:19 Media kernel: res 40/00:00:00:00:00/00:00:00:00:00/40 Emask 0x10 (ATA bus error)

Sep 29 19:29:19 Media kernel: ata11.00: status: { DRDY }

Sep 29 19:29:19 Media kernel: ata11: hard resetting link

Sep 29 19:29:19 Media kernel: ata11: SATA link up 3.0 Gbps (SStatus 123 SControl 300)

Sep 29 19:29:19 Media kernel: ata11.00: configured for UDMA/133

Sep 29 19:29:19 Media kernel: ata11: EH complete

Sep 29 19:29:19 Media kernel: sdt: sdt1

Sep 29 19:29:19 Media kernel: sd 12:0:0:0: [sdt] Attached SCSI disk

Sep 29 19:29:19 Media kernel: ata11.00: exception Emask 0x10 SAct 0x1 SErr 0x780100 action 0x6

Sep 29 19:29:19 Media kernel: ata11.00: irq_stat 0x08000000

Sep 29 19:29:19 Media kernel: ata11: SError: { UnrecovData 10B8B Dispar BadCRC Handshk }

Sep 29 19:29:19 Media kernel: ata11.00: failed command: READ FPDMA QUEUED

Sep 29 19:29:19 Media kernel: ata11.00: cmd 60/08:00:00:00:00/00:00:00:00:00/40 tag 0 ncq 4096 in

Sep 29 19:29:19 Media kernel: res 40/00:00:00:00:00/00:00:00:00:00/40 Emask 0x10 (ATA bus error)

Sep 29 19:29:19 Media kernel: ata11.00: status: { DRDY }

Sep 29 19:29:19 Media kernel: ata11: hard resetting link

Sep 29 19:29:19 Media kernel: ata11: SATA link up 3.0 Gbps (SStatus 123 SControl 300)

Sep 29 19:29:19 Media kernel: ata11.00: configured for UDMA/133

Sep 29 19:29:19 Media kernel: ata11: EH complete

Sep 29 19:29:19 Media kernel: ata11.00: exception Emask 0x10 SAct 0x1 SErr 0x780100 action 0x6

Sep 29 19:29:19 Media kernel: ata11.00: irq_stat 0x08000000

Sep 29 19:29:19 Media kernel: ata11: SError: { UnrecovData 10B8B Dispar BadCRC Handshk }

Sep 29 19:29:19 Media kernel: ata11.00: failed command: READ FPDMA QUEUED

Sep 29 19:29:19 Media kernel: ata11.00: cmd 60/08:00:00:00:00/00:00:00:00:00/40 tag 0 ncq 4096 in

Sep 29 19:29:19 Media kernel: res 40/00:00:00:00:00/00:00:00:00:00/40 Emask 0x10 (ATA bus error)

Sep 29 19:29:19 Media kernel: ata11.00: status: { DRDY }

Sep 29 19:29:19 Media kernel: ata11: hard resetting link

Sep 29 19:29:19 Media kernel: ata11: SATA link up 3.0 Gbps (SStatus 123 SControl 300)

Sep 29 19:29:19 Media kernel: ata11.00: configured for UDMA/133

Sep 29 19:29:19 Media kernel: ata11: EH complete

Sep 29 19:29:19 Media kernel: ata11: limiting SATA link speed to 1.5 Gbps

Sep 29 19:29:19 Media kernel: ata11.00: exception Emask 0x10 SAct 0x1 SErr 0x780100 action 0x6

Sep 29 19:29:19 Media kernel: ata11.00: irq_stat 0x08000000

Sep 29 19:29:19 Media kernel: ata11: SError: { UnrecovData 10B8B Dispar BadCRC Handshk }

Sep 29 19:29:19 Media kernel: ata11.00: failed command: READ FPDMA QUEUED

Sep 29 19:29:19 Media kernel: ata11.00: cmd 60/08:00:00:66:54/00:00:57:00:00/40 tag 0 ncq 4096 in

Sep 29 19:29:19 Media kernel: res 40/00:00:00:66:54/00:00:57:00:00/40 Emask 0x10 (ATA bus error)

Sep 29 19:29:19 Media kernel: ata11.00: status: { DRDY }

Sep 29 19:29:19 Media kernel: ata11: hard resetting link

Sep 29 19:29:19 Media kernel: ata11: SATA link up 1.5 Gbps (SStatus 113 SControl 310)

Sep 29 19:29:19 Media kernel: ata11.00: configured for UDMA/133

Sep 29 19:29:19 Media kernel: ata11: EH complete
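For anyone else staring at a syslog like this without knowing what to look for, here is a rough sketch (not an official unRAID tool) that tallies libata exceptions and link resets per ataNN port. The regular expressions are assumptions based only on the line format pasted above, and `SAMPLE` stands in for the real syslog file.

```python
import re
from collections import Counter

# A few lines copied from the excerpt above; in practice, read the syslog file instead.
SAMPLE = """\
Sep 29 19:29:19 Media kernel: ata11.00: exception Emask 0x10 SAct 0x1 SErr 0x780100 action 0x6
Sep 29 19:29:19 Media kernel: ata11: SError: { UnrecovData 10B8B Dispar BadCRC Handshk }
Sep 29 19:29:19 Media kernel: ata11: hard resetting link
Sep 29 19:29:19 Media kernel: ata11: limiting SATA link speed to 1.5 Gbps
"""

# Patterns are guesses derived from the lines in this thread, not a full libata grammar.
EXCEPTION = re.compile(r"kernel: (ata\d+)\.\d+: exception ")
RESET = re.compile(r"kernel: (ata\d+): hard resetting link")
SERROR = re.compile(r"kernel: (ata\d+): SError: \{ (.*?) \}")


def summarize(lines):
    """Count exceptions and hard resets per port and collect the SError flags seen."""
    exceptions, resets, flags = Counter(), Counter(), {}
    for line in lines:
        if m := EXCEPTION.search(line):
            exceptions[m.group(1)] += 1
        elif m := RESET.search(line):
            resets[m.group(1)] += 1
        elif m := SERROR.search(line):
            flags.setdefault(m.group(1), set()).update(m.group(2).split())
    return exceptions, resets, flags


exceptions, resets, flags = summarize(SAMPLE.splitlines())
for port in sorted(exceptions):
    print(port, "exceptions:", exceptions[port], "resets:", resets[port],
          "SError flags:", sorted(flags.get(port, [])))
```

A port that shows up over and over with BadCRC or Handshk flags is the one to chase first.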

SuperW2 · September 30, 2011

bump...any help or ideas?

SuperW2 · September 30, 2011

One more update (I'll keep talking to myself)...

I hard booted the server, since I decided it was hung... The GUI came back up after the reboot. I started another rebuild last night and it hung again in the same way this AM. Not sure if the rebuild of Disk 17 finished or not, but I see folders/files and a couple hundred gigs of files on it when I browse (with the Array started but unprotected).

It appears there could be issues with one of the 2 new 1.5TB drives I installed (Disk 17/18)... I went to try to copy the data from them to free space on other drives, and it appears the server hung again during the file copy; I lost the GUI in the same fashion. I see a ton of SMART errors on Disk 18 (and it's new and was precleared). It's plugged in via a 4-port SAS-to-SATA cable, which I also swapped last night for another new one.

So at this point, I have a Yellow Ball on Disk 17, I'm unsure if the data rebuild has completed on it, and I potentially have a dead/dying Disk 17 and/or 18 that I can't get the data off of before the GUI hangs/crashes again. If I could get the data off and have my Array protected again, I'd pull both of those disks from the array and troubleshoot offline, but at this point that doesn't seem like it's going to happen.

HELP!

dgaschk · September 30, 2011

The syslog shows a few things.

1. It appears the server lost power or was shut down incorrectly before this log. How are you shutting down? You must shut down using the button on the webGUI or install the powerdown script.

2. The (ATA bus error) and (HSM violation) messages indicate bad or loose SATA cables (http://lime-technology.com/wiki/index.php?title=The_Analysis_of_Drive_Issues). Three drives are giving these types of errors, which makes me think it's a power supply problem. What PSU is in the system? Unfortunately the log does not contain a direct mapping from ataXX to sdX. I've provided the info the syslog does contain below:

ata1: maybe sda?

ata1.00: ATA-7: ST3750640AS, 3.AAE, max UDMA/133

ata1.00: 1465149168 sectors, multi 0: LBA48 NCQ (depth 31/32)

ata5: ata5.00: ATA-7: ST3750640AS, 3.AAE, max UDMA/133

ata5.00: 1465149168 sectors, multi 0: LBA48 NCQ (depth 31/32)

ata19: ata19.00: ATA-8: ST1500DL003-9VT16L, CC32, max UDMA/133

ata19.00: 2930277168 sectors, multi 0: LBA48 NCQ (depth 31/32)
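If you want to pull that same port-to-model info out of a syslog yourself, something along these lines might work. It is only a sketch: the pattern is assumed from the identify lines quoted above, and it still cannot tell you which sdX a port is, since the log does not contain that mapping.

```python
import re

# Identify lines as quoted above; a real run would read them from the syslog.
SAMPLE = """\
ata1.00: ATA-7: ST3750640AS, 3.AAE, max UDMA/133
ata5.00: ATA-7: ST3750640AS, 3.AAE, max UDMA/133
ata19.00: ATA-8: ST1500DL003-9VT16L, CC32, max UDMA/133
"""

# Assumed layout: "<port>.<dev>: ATA-<n>: <model>, <firmware>, max ..."
IDENT = re.compile(r"(ata\d+)\.\d+: ATA-\d+: ([^,]+), ([^,]+),")


def port_models(lines):
    """Return {port: (model, firmware)} for every identify line found."""
    result = {}
    for line in lines:
        m = IDENT.search(line)
        if m:
            result[m.group(1)] = (m.group(2).strip(), m.group(3).strip())
    return result


for port, (model, firmware) in port_models(SAMPLE.splitlines()).items():
    print(port, model, firmware)
```

Matching the model and firmware against the labels on the physical drives is then a manual step.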

EDIT: fixed typos. P.S. check your messages or email.

SuperW2 · September 30, 2011

updated syslog

syslog-2011-09-30.txt

dgaschk · September 30, 2011

The new syslog shows problems with ata1 and ata11. Which PSU is in the system?

SuperW2 · October 2, 2011

OCZ 700W - OCZ700MXSP - http://www.newegg.com/Product/Product.aspx?Item=N82E16817341018

prostuff1 · October 2, 2011

That PSU has 2 12V rails with 25 Amps each. The max number of drives would be 10 green and 7 7200RPM drives.

Even if those rails are not separate then you have 50 amps which would cover 20 Green and 15 7200RPM.

Give us a breakdown of your drives, but it looks like a new PSU might be in order.

Joe L. · October 2, 2011

His syslog shows 19 disks, even if all green, that would need a 38 Amp supply for just the disks, but the rail used by the disks is also shared by the motherboard and fans. If not all "green" then something more like 65 Amps is probably needed.

A different (high current, single 12 volt rail) power supply is definitely in order. I'd look for something with 45 Amps or more capacity, depending on the mix of "green" and "non-green" drives.. ( 2 Amps per green drive, 3 per non-green. )

SuperW2 · October 2, 2011

20 Drives total with Parity and Cache

5 are "green" 5900rpm and 15 are 7200rpm... Plus the 9 Case fans, CPU fan, etc.

Odd thing is I've not had any problems like this (or at least that I knew about) until I recently upgraded the 5 green drives.

Any suggestions for Power Supply... I'm not seeing the Amp rating on PSU's on NewEgg...at least that I understand...

?

+3.3V@30A, +5V@30A, +12V@80A

prostuff1 · October 2, 2011

It's a small miracle you have not had any problems show up until now. You need 45A for the non-green drives and another 10A for the green drives. So you have 55A JUST for your drives. Then you have the motherboard and whatnot to power also.

You need to look at the +12V line and in your example above that is listed at 80A. That would be plenty for your drives.
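For reference, the arithmetic behind those numbers can be sketched like this. It just applies the rule of thumb quoted earlier in the thread (2 Amps per green drive, 3 per non-green); the function and constant names are made up for illustration, and motherboard and fan draw are not included.

```python
# Rule-of-thumb 12V draw per drive, as quoted in this thread (not measured values).
GREEN_AMPS = 2.0
NON_GREEN_AMPS = 3.0


def drive_12v_amps(green, non_green):
    """Estimate the 12V current needed by the drives alone."""
    return green * GREEN_AMPS + non_green * NON_GREEN_AMPS


drives_only = drive_12v_amps(green=5, non_green=15)
print(drives_only)          # 55.0 Amps for 5 green + 15 non-green drives
print(80 - drives_only)     # 25.0 Amps left on an 80A rail for motherboard, fans, etc.
```

By the same rule, 19 all-green drives would come to 38A, which is where the earlier figure comes from.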

SuperW2 · October 2, 2011

Thanks, I'll upgrade the PSU and hope that resolves the drive issues.

To Verify... is 80A overkill, or what I need with the mix of drives plus my Fans and stuff?

Would Something like this be OK?

http://www.newegg.com/Product/Product.aspx?Item=N82E16817171049

Single 12V Rail and +12V@80A

prostuff1 · October 2, 2011

Look for something in the 65-70A range

dgaschk · October 3, 2011

That power supply is not appropriate for unRAID. It has 2 12V busses at 25A each. 25A is not nearly enough to run all those drives. See here: http://lime-technology.com/forum/index.php?topic=12219.0

SuperW2 · October 3, 2011

Which, the one I own, or the one I linked a couple posts ago asking if it would be a better PSU???

I still find it odd that I had no issues until I replaced 5 non-green 7200 rpm drives with 5 green 5900 RPM ones over the past few weeks...

prostuff1 · October 3, 2011

He is talking about the old supply. And the not having issue part is/was most likely dumb luck.

I have a feeling that the supply you have is actually a single rail but even then 50A for everything is not enough. You have too many drives for 50A.

SuperW2 · October 3, 2011

Well, I have a new 80A one coming on Tuesday, so I'll find out.

SuperW2 · October 4, 2011

Things appear to be getting worse... Just installed a new Cooler Master Silent Pro RSA00-AMBAJ3-US Power Supply.

Double, triple and quadruple checked (and swapped) all power and SATA cables and can only get 2 of the 4 drives plugged into my Adaptec 1430 card to appear in its boot BIOS screen...

Since the drives are running out of an IcyDock 5-in-3 cage, I'm wondering if the power adapter at the back of the cage is dead (the 2 drives that are not booting are right next to each other in the same 5-in-3 cage)... That, or the nearly brand new Adaptec card took a dump, which I guess is possible.

REALLY REALLY REALLY Frustrating...

So now I have a Yellow and 2 Red Balls preventing me from basically doing anything!!!

*** EDIT *** Ok, so I swapped the SATA Cables from the 0/1 ports with the 2/3 ports on my Adaptec Card and it now registers 0/1 as live and 2/3 as not... (it was the other way around before).

So it's not my Adaptec Card and more likely the IcyDock Cage Power that has died and no longer powering the 2 drives on the right side of the cage... Awesome!

*** EDIT #2 *** Bays 4 and 5 are dead in my IcyDock cage... regardless of which drives are in those spots, or which SATA cables are used, I get nothing in those 2 slots. I'll try to get a warranty replacement, but I'm not holding my breath.

*** EDIT #3 *** Removed all drives from the Icy Dock, plugged them directly into SATA/power, and they all come up... BUT I'm still getting a Red Ball on Disk 9 and the same Yellow I was getting on Disk 17 from the start of this thread. For some reason, both Disk 9 and Disk 17 show as unformatted. I don't want to (and can't) lose the data on those 2 drives. Can I plug those drives separately into a Windows or Linux box to recover data from them before unRAID tries to format them?

Below is exactly how it sits right this second!

[Attached screenshot: 10-4-2011 4-53-02 PM.jpg]

SuperW2 · October 4, 2011

My Current Boot Syslog Attached

syslog-2011-10-04.txt

SuperW2 · October 5, 2011

Bump.... Anything else weird on Syslog I should know about?

I've plugged Drives 9 and 17 into my Windows box and am backing up the data with a ReiserFS tool. I'll likely wipe and remove them from the Array... rebuild Parity without them, and re-add them.

I have a new IcyDock cage that should be here today.

dgaschk · October 5, 2011

You could format disk 17 and then rebuild disk 9. Disk 9 will remain red until it is rebuilt.

mbryanr · October 5, 2011

This is disk17

Oct 4 16:48:13 Media kernel: ata11.00: exception Emask 0x10 SAct 0x1 SErr 0x780100 action 0x6

Oct 4 16:48:13 Media kernel: ata11.00: irq_stat 0x08000000

Oct 4 16:48:13 Media kernel: ata11: SError: { UnrecovData 10B8B Dispar BadCRC Handshk }

Oct 4 16:48:13 Media kernel: ata11.00: failed command: READ FPDMA QUEUED

Oct 4 16:48:13 Media kernel: ata11.00: cmd 60/08:00:00:00:00/00:00:00:00:00/40 tag 0 ncq 4096 in

Oct 4 16:48:13 Media kernel: res 40/00:00:00:00:00/00:00:00:00:00/40 Emask 0x10 (ATA bus error)

See link for above...

http://lime-technology.com/wiki/index.php?title=The_Analysis_of_Drive_Issues#Drive_Interface_Issues

Oct 4 16:48:13 Media kernel: ata11.00: qc timeout (cmd 0x27)

Oct 4 16:48:13 Media kernel: ata11.00: failed to read native max address (err_mask=0x4)

Oct 4 16:48:13 Media kernel: ata11.00: HPA support seems broken, skipping HPA handling

Oct 4 16:48:13 Media kernel: ata11.00: revalidation failed (errno=-5)

Oct 4 16:48:13 Media kernel: ata11: hard resetting link

Oct 4 16:48:13 Media kernel: ata11: SATA link up 3.0 Gbps (SStatus 123 SControl 300)

Oct 4 16:48:13 Media kernel: ata11.00: configured for UDMA/133

Oct 4 16:48:13 Media kernel: ata11: EH complete

Oct 4 16:48:13 Media kernel: ata11: limiting SATA link speed to 1.5 Gbps

Oct 4 16:48:13 Media kernel: ata11.00: exception Emask 0x10 SAct 0x1 SErr 0x780100 action 0x6

Oct 4 16:48:13 Media kernel: ata11.00: irq_stat 0x08000000

Oct 4 16:48:13 Media kernel: ata11: SError: { UnrecovData 10B8B Dispar BadCRC Handshk }

Oct 4 16:48:13 Media kernel: ata11.00: failed command: READ FPDMA QUEUED

Oct 4 16:48:13 Media kernel: ata11.00: cmd 60/08:00:00:00:00/00:00:00:00:00/40 tag 0 ncq 4096 in

Oct 4 16:48:13 Media kernel: res 40/00:00:00:00:00/00:00:00:00:00/40 Emask 0x10 (ATA bus error)

Oct 4 16:48:13 Media kernel: ata11.00: status: { DRDY }

Oct 4 16:48:13 Media kernel: ata11: hard resetting link

Oct 4 16:48:13 Media kernel: ata11: SATA link up 1.5 Gbps (SStatus 113 SControl 310)

Oct 4 16:48:13 Media kernel: ata11.00: configured for UDMA/133

Oct 4 16:48:13 Media kernel: ata11: EH complete

Originally disk9 gets identified with the correct size

Oct 4 16:48:13 Media kernel: sd 3:0:0:0: [sdd] 1465149168 512-byte logical blocks: (750 GB/698 GiB)

but later the same error as disk17

<Line 1476: Oct 4 16:48:59 Media logger: mount: wrong fs type, bad option, bad superblock on /dev/md17>

Oct 4 16:48:59 Media logger: mount: wrong fs type, bad option, bad superblock on /dev/md9,

Oct 4 16:48:59 Media logger: missing codepage or helper program, or other error

Oct 4 16:48:59 Media logger: In some cases useful info is found in syslog - try

Oct 4 16:48:59 Media logger: dmesg | tail or so
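To list every array device that failed to mount this way, a small scan like the following could help. It is a sketch only: the pattern is taken from the two mount errors quoted above, and `SAMPLE` stands in for the real syslog.

```python
import re

# The two mount failures quoted above; a real run would read the whole syslog.
SAMPLE = """\
<Line 1476: Oct 4 16:48:59 Media logger: mount: wrong fs type, bad option, bad superblock on /dev/md17>
Oct 4 16:48:59 Media logger: mount: wrong fs type, bad option, bad superblock on /dev/md9,
"""

# Assumed message text, copied from the log lines in this post.
MOUNT_FAIL = re.compile(
    r"mount: wrong fs type, bad option, bad superblock on (/dev/md\d+)"
)


def failed_mounts(lines):
    """Return the md devices that logged a failed mount, in first-seen order."""
    devices = []
    for line in lines:
        m = MOUNT_FAIL.search(line)
        if m and m.group(1) not in devices:
            devices.append(m.group(1))
    return devices


print(failed_mounts(SAMPLE.splitlines()))
```

Here that would flag md17 and md9, matching the two disks showing as unformatted.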

dgaschk · October 5, 2011

You may need to run reiserfsck to retrieve any data on these drives.

SuperW2 · October 5, 2011

Should I do that before I wipe/reformat and try to rebuild?
