
First time with hard drive errors and I have lots of questions


daniel.boone


I recently swapped my cache drive. The swap went well, and the new cache has been running fine for over a week, so I ran a preclear on the old cache drive. That also went well. I then added the drive to my system in the hope of extending my array by 1 TB. This would be disk number 8, on a port provided by a generic Monoprice PCI SIL SATA card. The format went well, so I started adding content. At the 50 GB mark I started getting errors:

 

Mar 8 05:44:24 Tower kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Mar 8 05:44:24 Tower kernel: ata4.00: configured for UDMA/33
Mar 8 05:44:24 Tower kernel: ata4: EH complete
Mar 8 05:44:24 Tower kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Mar 8 05:44:24 Tower kernel: ata4.00: BMDMA2 stat 0x6d0009
Mar 8 05:44:24 Tower kernel: ata4.00: failed command: READ DMA EXT
Mar 8 05:44:24 Tower kernel: ata4.00: cmd 25/00:00:a7:72:cd/00:04:23:00:00/e0 tag 0 dma 524288 in
Mar 8 05:44:24 Tower kernel: res 51/04:8f:18:75:cd/00:01:23:00:00/e0 Emask 0x1 (device error)
Mar 8 05:44:24 Tower kernel: ata4.00: status: { DRDY ERR }
Mar 8 05:44:24 Tower kernel: ata4.00: error: { ABRT }
Mar 8 05:44:24 Tower kernel: ata4.00: configured for UDMA/33
Mar 8 05:44:24 Tower kernel: ata4: EH complete
Mar 8 05:44:55 Tower kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Mar 8 05:44:55 Tower kernel: ata4.00: failed command: READ DMA EXT
Mar 8 05:44:55 Tower kernel: ata4.00: cmd 25/00:00:a7:72:cd/00:04:23:00:00/e0 tag 0 dma 524288 in
Mar 8 05:44:55 Tower kernel: res 40/00:8f:18:75:cd/00:01:23:00:00/e0 Emask 0x4 (timeout)
Mar 8 05:44:55 Tower kernel: ata4.00: status: { DRDY }
Mar 8 05:44:55 Tower kernel: ata4: hard resetting link
Mar 8 05:44:55 Tower kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Mar 8 05:44:55 Tower kernel: ata4.00: configured for UDMA/33
Mar 8 05:44:55 Tower kernel: ata4: EH complete
Mar 8 05:44:55 Tower kernel: ata4: drained 32768 bytes to clear DRQ.
Mar 8 05:44:55 Tower kernel: ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Mar 8 05:44:55 Tower kernel: ata4.00: failed command: READ DMA EXT
Mar 8 05:44:55 Tower kernel: ata4.00: cmd 25/00:00:a7:76:cd/00:04:23:00:00/e0 tag 0 dma 524288 in
Mar 8 05:44:55 Tower kernel: res ff/ff:ff:ff:ff:ff/ff:ff:ff:ff:ff/ff Emask 0x2 (HSM violation)
Mar 8 05:44:55 Tower kernel: ata4.00: status: { Busy }
Mar 8 05:44:55 Tower kernel: ata4.00: error: { ICRC UNC IDNF ABRT }
Mar 8 05:44:55 Tower kernel: ata4: hard resetting link
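For anyone sifting through a syslog like the one above, here is a rough Python sketch that tallies the failure reasons on the `res` lines and decodes the `cmd` taskfile into a starting sector and a length. The function names `tally_failures` and `decode_cmd` are made up for illustration, and the field order (command/feature:nsect:lbal:lbam:lbah/hob fields/device) is my understanding of how libata prints the taskfile, so treat it as an assumption. It also assumes 512-byte sectors, which at least agrees with the `dma 524288` figure in the log.

```python
import re
from collections import Counter

# "res <taskfile> Emask 0x4 (timeout)" -> captures the mask and the reason text.
RES_RE = re.compile(r"res \S+ Emask (0x[0-9a-f]+) \(([^)]+)\)")
# "cmd 25/00:00:a7:72:cd/00:04:23:00:00/e0" -> opcode, low bytes, high (hob) bytes, device.
CMD_RE = re.compile(r"cmd ([0-9a-f]{2})/([0-9a-f:]{14})/([0-9a-f:]{14})/([0-9a-f]{2})")


def tally_failures(lines):
    """Count how often each failure reason appears on the 'res' lines."""
    counts = Counter()
    for line in lines:
        match = RES_RE.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


def decode_cmd(line):
    """Decode a libata 'cmd' line into opcode, 48-bit LBA and sector count.

    Assumed field order: feature, nsect, lbal, lbam, lbah, then the same
    five fields again for the high-order (hob) bytes. A sector count of 0
    (which means 65536 for 48-bit commands) is not special-cased here.
    """
    match = CMD_RE.search(line)
    if match is None:
        return None
    opcode = int(match.group(1), 16)
    _feat, nsect, lbal, lbam, lbah = (int(b, 16) for b in match.group(2).split(":"))
    _hfeat, hnsect, hlbal, hlbam, hlbah = (int(b, 16) for b in match.group(3).split(":"))
    lba = (hlbah << 40) | (hlbam << 32) | (hlbal << 24) | (lbah << 16) | (lbam << 8) | lbal
    sectors = (hnsect << 8) | nsect
    return {"opcode": opcode, "lba": lba, "sectors": sectors}


# A few lines copied from the log in this post.
SAMPLE = [
    "Mar 8 05:44:24 Tower kernel: ata4.00: cmd 25/00:00:a7:72:cd/00:04:23:00:00/e0 tag 0 dma 524288 in",
    "Mar 8 05:44:24 Tower kernel: res 51/04:8f:18:75:cd/00:01:23:00:00/e0 Emask 0x1 (device error)",
    "Mar 8 05:44:55 Tower kernel: res 40/00:8f:18:75:cd/00:01:23:00:00/e0 Emask 0x4 (timeout)",
    "Mar 8 05:44:55 Tower kernel: res ff/ff:ff:ff:ff:ff/ff:ff:ff:ff:ff/ff Emask 0x2 (HSM violation)",
]

if __name__ == "__main__":
    print(tally_failures(SAMPLE))
    print(decode_cmd(SAMPLE[0]))
```

To run it against a full log, pass `open("syslog")` (or whatever your log file is called) to `tally_failures` instead of `SAMPLE`. A mix of device errors, timeouts and HSM violations on one port does not by itself say whether the drive, the cable or the controller is at fault, but it makes it easier to compare before and after a hardware change.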

 

As the cache drive, this disk worked like a champ. It is a 7200 RPM Caviar Black.

 

Adding any new data gives me the same errors. I did a proper shutdown and replaced the SATA and power cables with others known to be good. I still get errors.

 

I removed the drive and it seems to work well in a Windows system. I added it back and ran a repair, which ran fine. But the minute I try to add data I get the errors again.

 

 

Could the SATA card be causing all this trouble? Has the motherboard reached some limit with 9 SATA drives? I can add an IDE drive as a cache and it seems fine. What is the best way to test the drive outside of unRaid? If I remove the drive and run preclear, it now hangs, and the error log gives me the same error messages as when adding data. How can I remove the unRaid signature so the drive is treated like new?

 

Right now all my data seems safe. I have an 80 GB SATA drive I'm planning to try out in a data rebuild.

 

TIA,

db

 


Thanks for the hints.

 

I was beginning to suspect the card myself. What do you think about a MB swap at this point? I have a forum "tested" Gigabyte MB with 8 onboard ports. Once I add my known good SIL card I will have 10 ports. That is plenty for the drives I have now.

 

The PS is a 550 watt Corsair. My rig seems to work fine if I add an IDE drive for cache. I can't reformat it to test as it has some data on it.

 

Any ideas on how to "reset" a hard drive? I would like to start the drive as if it were new (new partition and format), add it to the array, and have it rebuild the data. It seems that when I re-add the drive now, unRaid recognizes its signature and bypasses the format. I'm thinking gparted might help here.

 

Thanks db

 


I swapped out the MB at 3 AM this morning. This allowed me to use the drive on the onboard SATA. It's still early, but the errors seem to be gone.

 

Either the ASUS P5E-VM D0 didn't like the PCI controller or it was just defective. I replaced it with a GIGABYTE GA-EP45-UD3P. I'll test and post the results for inclusion on the HW page. I may try the PCI SATA card again, but for now I will give unRaid some time to settle.


This topic is now archived and is closed to further replies.
