mbryanr Posted September 8, 2011

Well, I was in the process of ripping an MKV to my unRAID server when I noticed that the writes were pausing and incredibly slow. I stopped the rip to investigate. I reviewed the syslog and saw the first error:

ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Sep 7 21:26:05 Tower kernel: ata8.00: irq_stat 0x48000000
Sep 7 21:26:05 Tower kernel: ata8.00: failed command: READ DMA EXT

I saw this repeated many times:

handle_stripe read error: 913706824/6, count: 1

So I ran a SMART test, which completed with this message:

Completed: read failure 90%

and found 15 pending sectors on sdh (Samsung HD204UI). I then rebooted and started a "parity check - no correct".

What should I do next?

Build:
Version: unRAID 4.7
Motherboard: Asus M4A785-M
PSU: Corsair CX430
CPU: AMD Sempron 140
Memory: G.SKILL 4GB (2 x 2GB) 240-Pin DDR2
Add-on controller: JMB362
Flash Drive: Kingston DataTraveler 101 Gen 2

Edit: Another concern is that the drive is throwing up read errors in my syslog... it won't be long before it becomes unresponsive. Also, the no-correct parity check is running really slowly (see attached).

Attachment: unRAID_HD_Failure.zip
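For anyone wanting to gauge how often these errors recur, here is a minimal Python sketch that tallies the two message types quoted above. The regular expressions are modelled only on the lines in this post, and the function name `summarize_syslog` is made up for illustration. Treating the number after the slash in the `handle_stripe` message as a per-disk index is an assumption, not something the post confirms.

```python
import re
from collections import Counter

# Patterns modelled on the syslog lines quoted in the post (assumed format).
ATA_FAILED = re.compile(r"(ata\d+\.\d+): failed command: (.+)")
STRIPE_ERR = re.compile(r"handle_stripe read error: (\d+)/(\d+), count: (\d+)")


def summarize_syslog(lines):
    """Count failed ATA commands per (port, command) and stripe read
    errors per value after the slash (assumed to identify the disk)."""
    failed = Counter()
    stripe = Counter()
    for line in lines:
        m = ATA_FAILED.search(line)
        if m:
            failed[(m.group(1), m.group(2).strip())] += 1
            continue
        m = STRIPE_ERR.search(line)
        if m:
            stripe[int(m.group(2))] += 1
    return failed, stripe


# Sample lines copied from the post above.
sample = [
    "Sep 7 21:26:05 Tower kernel: ata8.00: irq_stat 0x48000000",
    "Sep 7 21:26:05 Tower kernel: ata8.00: failed command: READ DMA EXT",
    "handle_stripe read error: 913706824/6, count: 1",
]
failed, stripe = summarize_syslog(sample)
print(failed)
print(stripe)
```

To run it against a saved syslog, pass the open file object in place of `sample`. A tally that keeps pointing at the same port narrows the problem to one drive or one cable run.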
SSD Posted September 8, 2011

The drive has some pending sectors, but otherwise looks okay. Given the syslog errors you are seeing, I suspect some kind of cabling problem to the drive (either the SATA or the power connection may be flaky). I'd try reseating the drive, replacing (or resecuring) the cables, and running a few parity checks to see if the problem is fixed.
mbryanr Posted September 8, 2011

Thinking along the same lines... except the pending sectors have increased from 0 to 14 to 40 in the last 2 hours. I'll check the cables once my daughter finishes watching a movie (why now?).

Attachment: an_hour_later_smart.txt
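A rough way to track that growth between SMART reports is sketched below in Python. It assumes the usual `smartctl -A` attribute table layout, where attribute 197 (Current_Pending_Sector) carries its raw value in the tenth column. The sample line is illustrative and is not taken from the attached report. The helper names `pending_sector_count` and `growth` are hypothetical.

```python
def pending_sector_count(smartctl_output):
    """Return the raw value of SMART attribute 197, or None if absent.

    Assumes the standard ten-column attribute table printed by smartctl -A.
    """
    for line in smartctl_output.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[0] == "197":
            return int(fields[9])
    return None


def growth(readings):
    """Return the change between each pair of consecutive readings."""
    return [b - a for a, b in zip(readings, readings[1:])]


# Illustrative attribute line (values other than the raw count are made up).
sample_line = (
    "197 Current_Pending_Sector  0x0012   100   100   000"
    "    Old_age   Always       -       40"
)
print(pending_sector_count(sample_line))

# The three counts reported in the post above.
print(growth([0, 14, 40]))
```

Increments that keep growing between readings, as they do for the counts quoted here, suggest the drive should be watched closely even if cabling turns out to be part of the problem.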
mbryanr Posted September 8, 2011

I have a replacement on the way. My procedure (please correct me if this is wrong; I'm out of ports except on another JMB363):

1 Stop the array
2 Power down
3 Connect another JMB363 to the motherboard
4 Connect the new drive to the new JMB363
5 Turn on
6 Pre-clear the new drive
7 Power down
8 Replace the failed drive with the new, pre-cleared drive
9 Connect the failed drive to the new JMB363
10 Turn on
11 The replaced drive appears with a blue dot
12 Tick the "I'm sure" checkbox and press Start, which "will bring the array on-line, start Data-Rebuild, and then expand the file system." Expect hefty disk activity: the main page will show lots of reading on the other disks and writing on the new disk as data is rebuilt.
13 Pre-clear/thrash the failed drive to ensure a Samsung replacement