
Errors on drive - What is the real culprit


walterg74


 

Well, I had just posted my issue and tried to upload a syslog, but the board said it was too big (by 4k) and deleted my entire post. Geez... so I'll try to keep it short...

 

I can't piece together from the wiki/forum whether I'm having disk issues or controller issues.

 

My setup:

 

- Motherboard with only TWO SATA ports

- Addon card with 4 SATA ports

- 3 500GB disks

- Parity connected to the addon card

- Data disk 1 connected to port 1 on the motherboard

- Data disk 2 connected to port 2 on the motherboard

 

On some days I got errors on disk 2. The only way around it was to power down, disconnect the drive (it wasn't recognized on restart, not even at POST), reconnect it, and power back on...

 

I bought a 1TB disk to replace it, but since the parity disk has to be at least as large as any data disk, the new disk had to replace the parity disk, and the old parity disk would replace the failed disk 2. I connected the new disk where the parity one was, and the "old" parity one where the failed disk 2 was. I redid everything and it worked just fine. Or so I thought.

 

A couple of days ago, I started getting errors again, and once more on the "same" disk 2... Since it's a different physical disk, I started wondering whether it's a faulty port on the motherboard, and whether I should replace the motherboard instead of the disks!

 

I have bought two more 1TB disks to "upgrade" my array to three 1TB disks, but I don't want to continue until I figure out what is causing the problem. I managed to get a syslog that seems to contain errors, but I just can't figure it out.

 

Can anyone knowledgeable in this tell by my syslog what the error is?  (or if I need to provide more info, I'll gladly do so).

 

Thanks!!  :-\

syslog-20100403-212626.zip

Link to comment

UPDATE:

 

OK, so one thing I forgot to mention: just in case it was the motherboard controller, after the last failure I connected the failed disk to the addon card instead of back to the motherboard.

 

Sadly, I have JUST NOW got errors again...

 

I went into the array through Windows Explorer, attempted to create a new folder on one of the shares, and it froze for a while (that's not a good sign, huh?). When I got control back and checked the unRAID management interface, I had 10 errors on disk 2 and it was disabled.... So... I know I can't blame the motherboard port, as it's not being used... I only have 3 possibilities now...

 

1) The motherboard is somehow not supplying the voltage correctly?

2) The PSU is faulty (though it is a brand new PSU, an OCZ StealthXStream 600W)

3) The cable is faulty?

 

or maybe it is bad luck and the disks are crap? Although again, when it was being used as the parity disk it never gave an error ever...

 

I captured the syslog, and I also saved it in Word format to keep the color coding, so here goes...

 

 

Here is a small excerpt:

 

System Log (last 6 lines)

Jun 16 14:50:37 Tower kernel: ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0xe frozen

Jun 16 14:50:37 Tower kernel: ata2: SError: { PHYRdyChg CommWake }

Jun 16 14:50:37 Tower kernel: ata2: hard resetting link

Jun 16 14:50:39 Tower kernel: ata2: SATA link down (SStatus 1 SControl 310)

Jun 16 14:50:39 Tower kernel: ata2: EH complete

Jun 16 14:50:43 Tower unmenu[1151]: cat: /sys/block/sdb/stat: No such file or directory

--------------------------------
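
For anyone reading along, the `SErr 0x50000` value in the first line of the excerpt is a bitmask, and the next log line spells out the set bits as `PHYRdyChg CommWake`. Below is a minimal Python sketch of that decoding, assuming the bit positions of the SATA SError diagnostic field (bits 16 to 26) and the short names the kernel prints; the helper names `decode_serr` and `parse_exception` are made up for illustration and are not part of unRAID or unmenu.

```python
import re

# Bit positions in the SATA SError register (diagnostic bits 16..26) and the
# short names the Linux kernel prints for them. Illustrative table only; the
# lower "error" bits (0..11) are left out of this sketch.
SERR_BITS = {
    16: "PHYRdyChg",   # PHY ready state changed (link came or went)
    17: "PHYInt",      # PHY internal error
    18: "CommWake",    # COMWAKE signal detected
    19: "10B8B",       # 10b-to-8b decode error
    20: "Dispar",      # disparity error
    21: "BadCRC",      # CRC error on the link
    22: "Handshk",     # handshake error
    23: "LinkSeq",     # link sequence error
    24: "TrStaTrns",   # transport state transition error
    25: "UnrecFIS",    # unrecognized FIS type
    26: "DevExch",     # device exchanged
}

# Matches lines such as "ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 ..."
EXCEPTION_RE = re.compile(r"(ata\d+): exception .*?SErr (0x[0-9a-fA-F]+)")


def decode_serr(value):
    """Return the names of the diagnostic bits set in an SError value."""
    return [name for bit, name in sorted(SERR_BITS.items()) if value & (1 << bit)]


def parse_exception(line):
    """Return (port, [flag names]) for an ata exception line, else None."""
    m = EXCEPTION_RE.search(line)
    if m is None:
        return None
    return m.group(1), decode_serr(int(m.group(2), 16))


line = ("Jun 16 14:50:37 Tower kernel: ata2: exception Emask 0x10 "
        "SAct 0x0 SErr 0x50000 action 0xe frozen")
print(parse_exception(line))  # ('ata2', ['PHYRdyChg', 'CommWake'])
```

Both flags describe the link itself changing state rather than bad data coming off the platters, which fits the "SATA link down" line that follows. On its own this does not say whether the drive, cable, port, or power is responsible.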

Latest_Syslog_With_Error.doc

syslog-2010-06-16.txt

Link to comment

What motherboard is it exactly? It wouldn't be a Gigabyte board, would it?

 

Hmm...

 

Well, this particular one is not, it is an Asus "A7N8X-E Deluxe"

 

Why??? (When I first thought it was the motherboard connector, I decided to buy a replacement motherboard. I bought a brand new Gigabyte G41M-E2SL, and it's right there in the box waiting to be installed...)

Link to comment

Have you checked the hardware compatibility list here:

http://lime-technology.com/wiki/index.php?title=Hardware_Compatibility

 

One possible clue:

 

"nVidia nForce series 5 or above (nForce4 or below NOT recommended, except Asus A8N-SLI)"

Also, the SATA ports are SATA 150 (1.5 Gb/s), so you may have to find a way to put your HDs into SATA I (1.5 Gb/s) mode...

 

 

And now that you've bought a Gigabyte board, read about the "HPA" feature in the forums first and then decide how to proceed.

 

Good luck

 

Link to comment


Hmm... well, what clue is that? I mean, that motherboard doesn't have an nForce SATA controller; it has a Silicon Image one:

 

Silicon Image Sil 3112A RAID Controller:

2 x Serial ATA

 

Additionally, as I said in my OP, it seems weird that it's always the second disk with an issue. I *thought* it could be the port on the motherboard, but as I said in my updated second post, I had last switched it to the addon card, and it threw errors again yesterday...

Link to comment

Have you tried different PSUs?

 

 

Nope, this is a brand new one, but of course it could be faulty. Is there any way to test it before actually swapping the PSU (since the problem occurs sporadically)? I guess that would be my next thing to try, because it's now a different physical disk on a different port (on a card, even), so the PSU could be the culprit... (Still, it keeps me wondering why the other disks were never affected.)
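
Since the failures are sporadic, one cheap thing to do before swapping hardware is to tally the link resets per `ataN` port across the saved syslogs. This is only a rough sketch in Python: the function name `count_link_events` is hypothetical, and the regular expression assumes the kernel message wording seen in the excerpt earlier in this thread. It does not test the PSU; it only shows whether the drops stay on one port or also turn up on others.

```python
import re
from collections import Counter

# Matches the two link-level messages seen in the excerpt, e.g.
# "kernel: ata2: hard resetting link" and "kernel: ata2: SATA link down (...)".
EVENT_RE = re.compile(
    r"kernel: (ata\d+)(?:\.\d+)?: (hard resetting link|SATA link down)"
)


def count_link_events(lines):
    """Count link reset / link down messages per (port, event) pair."""
    counts = Counter()
    for line in lines:
        m = EVENT_RE.search(line)
        if m:
            counts[(m.group(1), m.group(2))] += 1
    return counts


# Sample input copied from the excerpt posted earlier in the thread.
sample = [
    "Jun 16 14:50:37 Tower kernel: ata2: SError: { PHYRdyChg CommWake }",
    "Jun 16 14:50:37 Tower kernel: ata2: hard resetting link",
    "Jun 16 14:50:39 Tower kernel: ata2: SATA link down (SStatus 1 SControl 310)",
    "Jun 16 14:50:39 Tower kernel: ata2: EH complete",
]

counts = count_link_events(sample)
for (port, event), n in sorted(counts.items()):
    print(port, event, n)

# To run it on a saved log instead of the sample (path is up to you):
# with open("syslog-2010-06-16.txt") as f:
#     counts = count_link_events(f)
```

If every drop lands on whichever port the "disk 2" slot happens to be using, the cable or power lead to that slot looks more suspicious. Drops spread across several ports would be more consistent with something shared, such as the PSU.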

 

Link to comment

Archived

This topic is now archived and is closed to further replies.
