Unraid Dropping From Network



For the last 2 days my server has stopped replying. I try to reboot it via Shell In A Box, but that fails and I have to reboot it manually. It runs OK for a few hours after a reboot, then becomes unresponsive again. I thought it was network related, but the other 4 PCs on the network are fine. I am wondering if it could be my PCI NIC. I am attaching the syslog from the last startup. I do not see any syslogs on my flash drive from the last 2-3 weeks.

syslog-2012-04-01.txt

Link to comment


 

Syslogs are not saved to the flash drive automatically. They are in /var/log/
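Since unRAID runs from RAM, anything in /var/log/ is gone after a reboot, so it helps to copy the syslog to the flash drive before restarting. Below is a minimal sketch of that copy step. It assumes Python is available on the server (stock unRAID may not ship it), that the live log is /var/log/syslog, and that the flash drive is mounted at /boot. The function name `save_syslog` and the `logs` folder are made up for illustration.

```python
import shutil
import time
from pathlib import Path


def save_syslog(src="/var/log/syslog", dest_dir="/boot/logs"):
    """Copy the live syslog to a timestamped file and return the new path.

    Both default paths are assumptions about a typical unRAID setup.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # A timestamp in the name keeps earlier copies from being overwritten.
    target = dest / time.strftime("syslog-%Y-%m-%d_%H%M%S.txt")
    # copyfile copies contents only, which avoids permission quirks on FAT flash drives.
    shutil.copyfile(src, target)
    return target


# Typical use, run just before a reboot:
#     print(save_syslog())
```

If the server is already unresponsive this will not help, but running it on a schedule, or right after the first errors appear, leaves a copy that survives the hard reset.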

 

This drive:

Apr  1 19:53:44 Tower kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
Apr  1 19:53:44 Tower kernel: ata1.00: ATA-8: Hitachi HDS5C3020ALA632, ML6OA180, max UDMA/133
Apr  1 19:53:44 Tower kernel: ata1.00: 3907029168 sectors, multi 16: LBA48 NCQ (depth 31/32)
Apr  1 19:53:44 Tower kernel: ata1.00: configured for UDMA/100
Apr  1 19:53:44 Tower kernel: scsi 0:0:0:0: Direct-Access    ATA      Hitachi HDS5C302 ML6O PQ: 0 ANSI: 5
Apr  1 19:53:44 Tower kernel: sd 0:0:0:0: [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
Apr  1 19:53:44 Tower kernel: sd 0:0:0:0: [sdb] Write Protect is off
Apr  1 19:53:44 Tower kernel: sd 0:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Apr  1 19:53:44 Tower kernel: sd 0:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FU

 

is having lots of CRC errors (checksum errors in communicating with it over the SATA link).

It is your parity drive:

Apr  1 19:56:31 Tower kernel: md: import disk0: [8,16] (sdb) Hitachi_HDS5C3020ALA632_ML0220F30B4EJD size: 195351455

 

Apr  1 19:57:25 Tower kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x280000 action 0x6
Apr  1 19:57:25 Tower kernel: ata1.00: irq_stat 0x00020002, device error via D2H FIS
Apr  1 19:57:25 Tower kernel: ata1: SError: { 10B8B BadCRC }
Apr  1 19:57:25 Tower kernel: ata1.00: failed command: READ DMA EXT
Apr  1 19:57:25 Tower kernel: ata1.00: cmd 25/00:70:f0:16:37/00:01:00:00:00/e0 tag 0 dma 188416 in
Apr  1 19:57:25 Tower kernel:          res 51/84:41:1f:17:37/00:01:00:00:00/00 Emask 0x10 (ATA bus error)
Apr  1 19:57:25 Tower kernel: ata1.00: status: { DRDY ERR }
Apr  1 19:57:25 Tower kernel: ata1.00: error: { ICRC ABRT }
Apr  1 19:57:25 Tower kernel: ata1: hard resetting link
Apr  1 19:57:28 Tower kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
Apr  1 19:57:28 Tower kernel: ata1.00: configured for UDMA/100
Apr  1 19:57:28 Tower kernel: ata1: EH complete
Apr  1 19:57:32 Tower kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x280000 action 0x6
Apr  1 19:57:32 Tower kernel: ata1.00: irq_stat 0x00020002, device error via D2H FIS
Apr  1 19:57:32 Tower kernel: ata1: SError: { 10B8B BadCRC }
Apr  1 19:57:32 Tower kernel: ata1.00: failed command: READ DMA EXT
Apr  1 19:57:32 Tower kernel: ata1.00: cmd 25/00:f0:48:f3:3e/00:03:00:00:00/e0 tag 0 dma 516096 in
Apr  1 19:57:32 Tower kernel:          res 51/84:a1:97:f4:3e/00:02:00:00:00/00 Emask 0x10 (ATA bus error)
Apr  1 19:57:32 Tower kernel: ata1.00: status: { DRDY ERR }
Apr  1 19:57:32 Tower kernel: ata1.00: error: { ICRC ABRT }
Apr  1 19:57:32 Tower kernel: ata1: hard resetting link
Apr  1 19:57:34 Tower kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
Apr  1 19:57:34 Tower kernel: ata1.00: configured for UDMA/100
Apr  1 19:57:34 Tower kernel: ata1: EH complete
Apr  1 19:57:38 Tower last message repeated 34 times
Apr  1 19:57:38 Tower kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x280000 action 0x6
Apr  1 19:57:38 Tower kernel: ata1.00: irq_stat 0x00020002, device error via D2H FIS
Apr  1 19:57:38 Tower kernel: ata1: SError: { 10B8B BadCRC }
Apr  1 19:57:38 Tower kernel: ata1.00: failed command: READ DMA EXT
Apr  1 19:57:38 Tower kernel: ata1.00: cmd 25/00:00:48:ef:46/00:04:00:00:00/e0 tag 0 dma 524288 in
Apr  1 19:57:38 Tower kernel:          res 51/84:e1:67:ef:46/00:03:00:00:00/00 Emask 0x10 (ATA bus error)
Apr  1 19:57:38 Tower kernel: ata1.00: status: { DRDY ERR }
Apr  1 19:57:38 Tower kernel: ata1.00: error: { ICRC ABRT }
Apr  1 19:57:38 Tower kernel: ata1: hard resetting link
Apr  1 19:57:41 Tower kernel: ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
Apr  1 19:57:41 Tower kernel: ata1.00: configured for UDMA/100
Apr  1 19:57:41 Tower kernel: ata1: EH complete
Apr  1 19:57:44 Tower kernel: ata1: limiting SATA link speed to 1.5 Gbps
Apr  1 19:57:44 Tower kernel: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x280000 action 0x6
Apr  1 19:57:44 Tower kernel: ata1.00: irq_stat 0x00020002, device error via D2H FIS
Apr  1 19:57:44 Tower kernel: ata1: SError: { 10B8B BadCRC }
Apr  1 19:57:44 Tower kernel: ata1.00: failed command: READ DMA EXT
Apr  1 19:57:44 Tower kernel: ata1.00: cmd 25/00:00:48:7b:4d/00:04:00:00:00/e0 tag 0 dma 524288 in
Apr  1 19:57:44 Tower kernel:          res 51/84:b1:97:7c:4d/00:02:00:00:00/00 Emask 0x10 (ATA bus error)
Apr  1 19:57:44 Tower kernel: ata1.00: status: { DRDY ERR }
Apr  1 19:57:44 Tower kernel: ata1.00: error: { ICRC ABRT }
Apr  1 19:57:44 Tower kernel: ata1: hard resetting link
Apr  1 19:57:46 Tower kernel: ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 10)
Apr  1 19:57:46 Tower kernel: ata1.00: configured for UDMA/100
Apr  1 19:57:46 Tower kernel: ata1: EH complete

 

Usually, this is caused by bad cabling, by cabling picking up noise from wires it is bundled with, or by a power supply that is bad or unable to supply the drives with clean power.
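To see whether a cable or power supply change made a difference, one option is to count the CRC errors per ATA port in a fresh syslog and compare the totals before and after. The sketch below only looks for the `SError: { ... BadCRC }` lines shown above. The function name `count_crc_errors` is made up, and the script is not part of unRAID.

```python
import re
from collections import Counter

# Matches kernel lines such as "Tower kernel: ata1: SError: { 10B8B BadCRC }".
CRC_LINE = re.compile(r"kernel: (ata\d+): SError: \{[^}]*BadCRC[^}]*\}")


def count_crc_errors(lines):
    """Return a Counter mapping each ATA port (e.g. 'ata1') to its BadCRC line count."""
    counts = Counter()
    for line in lines:
        match = CRC_LINE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Typical use on the server (path assumed):
#     with open("/var/log/syslog", errors="replace") as log:
#         print(count_crc_errors(log))
```

The count is a lower bound, because syslog collapses repeats into "last message repeated N times" lines. After a good fix, the port should show no new BadCRC lines once the array has been read heavily, for example during a parity check.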

 

 

Link to comment

 


 

Thanks a lot for helping me out. I had a spare SATA cable and swapped it in for the old one. What should I do now to see if it fixed the problem?

 

I really do not know how to check the PSU. Do you have any instructions on how to rule out a faulty PSU?

 

I have also attached the new syslog, taken after installing the new SATA cable and rebooting.

syslog-2012-04-01_1.txt

Link to comment

OK, I checked the cables and they are fine. I bought and installed a new PSU and I am having the same problems. The server runs for a few hours and then becomes unresponsive. I try rebooting through Shell In A Box, but the server does not reboot. I have to shut it down manually and restart it, and then it works again for a few hours before the same problem returns. I do have a PCI NIC installed because the motherboard's onboard NIC was not working.

Link to comment


 

Sounds like my experience too.

Link to comment

OK, last night I unplugged the parity drive's cable from the SATA2 Serial ATA II PCI-Express RAID Controller Card and plugged it into the motherboard. The server has been up and running all night and I am no longer getting multiple emails about resync rebuilding. Nobody was using the server overnight, though, so I can only hope this fixed the problem.

 

Could the RAID card be bad, or maybe too slow?

Link to comment
