AeroSteveO Posted June 24, 2013
I ran a parity check recently and one of my disks was removed from my array due to write errors. This disk had been running fine until then for reads and writes with one of my virtual machines, so the write errors came as a surprise. The strangest part of the problem is the output in the syslog, which I haven't seen before (I've had disk errors and failures before, but never this type of output). The output just repeats over and over, never changing, except for the write errors during the parity check. I've attached a shortened and the full syslog. I ran a SMART test, which the drive passed, so I'm rebuilding the disk right now. The HDD is a WD Green drive that's been running smoothly for about 1.5 years. The unRAID version on my server is 5.0rc12. Any ideas on why this happened, or what happened?
syslog_shortened.txt
WeeboTech Posted June 24, 2013
You may be developing a bad spot on that disk.
Jun 23 22:18:48 ShadowOfIntent kernel: handle_stripe write error: 1080263264/4, count: 1
Jun 23 22:18:48 ShadowOfIntent kernel: md: disk4 write error
Jun 23 22:18:48 ShadowOfIntent kernel: handle_stripe write error: 1080263272/4, count: 1
Jun 23 22:18:48 ShadowOfIntent kernel: md: disk4 write error
Jun 23 22:18:48 ShadowOfIntent kernel: handle_stripe write error: 1080263280/4, count: 1
Jun 23 22:18:48 ShadowOfIntent kernel: md: disk4 write error
Jun 23 22:18:48 ShadowOfIntent kernel: handle_stripe write error: 1080263288/4, count: 1
Although you did a SMART test, was it a long test?
smartctl -t long
Attach the current SMART report. I would probably bring the array to an idle state without emhttp, run a smartctl -t long on the disk, and then check the SMART report for the status.
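To see whether the write errors cluster in one region of the disk, the log lines quoted above can be summarized with a short script. This is only a sketch: it assumes the number before the slash in each handle_stripe line is the sector and the number after it is the disk slot (which matches the "md: disk4 write error" lines). The function name summarize_write_errors is made up for illustration.

```python
import re

# Matches the unRAID md driver lines quoted in the post above.
# Assumption: "<sector>/<disk>" is the layout of the two numbers.
WRITE_ERROR = re.compile(r"handle_stripe write error: (\d+)/(\d+), count: (\d+)")


def summarize_write_errors(log_text):
    """Return {disk: (first_sector, last_sector, error_count)}."""
    summary = {}
    for match in WRITE_ERROR.finditer(log_text):
        sector = int(match.group(1))
        disk = int(match.group(2))
        first, last, count = summary.get(disk, (sector, sector, 0))
        summary[disk] = (min(first, sector), max(last, sector), count + 1)
    return summary


# The four handle_stripe lines from the syslog excerpt in this thread.
SAMPLE = """\
Jun 23 22:18:48 ShadowOfIntent kernel: handle_stripe write error: 1080263264/4, count: 1
Jun 23 22:18:48 ShadowOfIntent kernel: md: disk4 write error
Jun 23 22:18:48 ShadowOfIntent kernel: handle_stripe write error: 1080263272/4, count: 1
Jun 23 22:18:48 ShadowOfIntent kernel: md: disk4 write error
Jun 23 22:18:48 ShadowOfIntent kernel: handle_stripe write error: 1080263280/4, count: 1
Jun 23 22:18:48 ShadowOfIntent kernel: md: disk4 write error
Jun 23 22:18:48 ShadowOfIntent kernel: handle_stripe write error: 1080263288/4, count: 1
"""

if __name__ == "__main__":
    for disk, (first, last, count) in summarize_write_errors(SAMPLE).items():
        print(f"disk{disk}: {count} write errors, sectors {first}-{last}")
```

Run against the full syslog, a tight sector range would fit the "bad spot" idea, while errors spread across the whole disk would point more toward a cabling or power problem.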
dgaschk Posted June 25, 2013
Post a SMART report. The test can wait.
AeroSteveO Posted June 25, 2013
Here's a short SMART report. I started a long test, but it hasn't finished yet for some reason. I'll have the long SMART test done tomorrow if it's still being slow.
Edit: I just got back to my server to find that the long SMART test still hadn't finished after 8 hours of 'running'. My server is also giving odd segfaults occasionally when the smartctl command is used:
Jun 25 15:32:36 ShadowOfIntent kernel: mdcmd (23): import 22 0,0 (unRAID engine)
Jun 25 15:32:36 ShadowOfIntent kernel: mdcmd (24): import 23 0,0 (unRAID engine)
Jun 25 15:32:36 ShadowOfIntent emhttp_event: driver_loaded (System)
Jun 25 15:32:36 ShadowOfIntent kernel: smartctl[12019]: segfault at 52f0d641 ip 080aa050 sp bfacd6a0 error 6 (Errors)
Jun 25 15:32:38 ShadowOfIntent emhttp: shcmd (14316): rmmod md-mod |$stuff$ logger (Other emhttp)
Jun 25 15:32:38 ShadowOfIntent emhttp: shcmd (14317): modprobe md-mod super=/boot/config/super.dat slots=24 |$stuff$ logger (unRAID engine)
Jun 25 15:32:38 ShadowOfIntent emhttp: shcmd (14318): udevadm settle (Other emhttp)
Jun 25 15:32:38 ShadowOfIntent kernel: md: unRAID driver removed (System)
Jun 25 15:32:38 ShadowOfIntent kernel: md: unRAID driver 2.1.5 installed (System)
smart.txt
AeroSteveO Posted June 26, 2013
After fighting with smartctl for a bit, I managed to get it to output a long SMART report on the drive that's been giving me problems.
smartreportlong.txt
Joe L. Posted June 26, 2013
This line in the SMART report indicates the disk retracted the heads upon power failure 195 times.
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 195
You might want to check and re-seat the power cable to the drive (or any splitters, drive trays, or backplanes involved). If there is a loose connection or an intermittent cable, you'll find the disk failing intermittently. With the array stopped but the drives spinning, you might gently move the cables; you should not hear any disk spin down or retract its heads.
(I had a splitter in my server that was improperly crimped at its connectors... nearly the same symptoms as you. It causes "hair-loss", as you'll pull your hair out trying to figure out what is happening.)
Joe L.
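For anyone who wants to pull that number out of a saved report automatically, here is a rough sketch that parses one row of the smartctl attribute table. It assumes the usual ten-column layout (ID, name, flag, value, worst, threshold, type, updated, when-failed, raw value), which is what the line quoted above follows. The function name parse_smart_attribute is hypothetical.

```python
def parse_smart_attribute(line):
    """Parse one row of the SMART attribute table into a dict.

    Assumes the ten-column layout seen in the quoted report:
    ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    Returns None for lines that do not look like attribute rows.
    """
    fields = line.split()
    if len(fields) < 10 or not fields[0].isdigit():
        return None
    return {
        "id": int(fields[0]),
        "name": fields[1],
        "value": int(fields[3]),
        "worst": int(fields[4]),
        "thresh": int(fields[5]),
        # Raw values can contain spaces on some drives, so keep the rest as text.
        "raw": " ".join(fields[9:]),
    }


# The attribute row quoted from the SMART report in this thread.
LINE = "192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 195"

if __name__ == "__main__":
    attr = parse_smart_attribute(LINE)
    print(f"{attr['name']} (ID {attr['id']}): raw count {attr['raw']}")
```

Comparing the raw count between two reports taken a few days apart would show whether it is still climbing after the power cable has been re-seated.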
AeroSteveO Posted June 26, 2013
Now that you mention it, that HDD is plugged into one of my splitters, alongside my VM drive (not included in the array). I'll plug it into a different power connection. Thanks for the help, I wouldn't have noticed that.