Disk write error

AeroSteveO · June 24, 2013

I ran a parity check recently and one of my disks was removed from my array due to write errors. This disk had been running fine until then for read/write with one of my virtual machines, so the write error came as a surprise. The strangest part is the output in the syslog, which I haven't seen before (I've had disk errors/failures before, but never this type of output). The output just repeats over and over, never changing, except for the write errors during the parity check. I've attached both a shortened and the full syslog. I ran a SMART test, which the drive passed, so I'm rebuilding the disk right now. The HDD is a WD Green drive that's been running smoothly for about 1.5 years. The unRAID version on my server is 5.0rc12. Any ideas on why this happened/what happened?

syslog_shortened.txt

WeeboTech · June 24, 2013

You may be developing a bad spot on that disk.

Jun 23 22:18:48 ShadowOfIntent kernel: handle_stripe write error: 1080263264/4, count: 1
Jun 23 22:18:48 ShadowOfIntent kernel: md: disk4 write error
Jun 23 22:18:48 ShadowOfIntent kernel: handle_stripe write error: 1080263272/4, count: 1
Jun 23 22:18:48 ShadowOfIntent kernel: md: disk4 write error
Jun 23 22:18:48 ShadowOfIntent kernel: handle_stripe write error: 1080263280/4, count: 1
Jun 23 22:18:48 ShadowOfIntent kernel: md: disk4 write error
Jun 23 22:18:48 ShadowOfIntent kernel: handle_stripe write error: 1080263288/4, count: 1
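To see how tightly these errors cluster, here is a minimal Python sketch (not from the original thread) that pulls the numbers out of the handle_stripe lines and checks the spacing between them. It assumes the first number is the sector and the number after the slash is the disk slot, which matches the "disk4" lines but is not confirmed anywhere in this thread. The function names parse_write_errors and sector_gaps are made up for illustration.

```python
import re

# Matches lines like "... handle_stripe write error: 1080263264/4, count: 1".
# Assumption: the fields are sector/disk, then an error count.
PATTERN = re.compile(r"handle_stripe write error: (\d+)/(\d+), count: (\d+)")

def parse_write_errors(lines):
    """Return (sector, disk, count) tuples for each handle_stripe write error line."""
    errors = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            errors.append(tuple(int(group) for group in match.groups()))
    return errors

def sector_gaps(errors):
    """Return the difference between each pair of consecutive sector numbers."""
    sectors = [error[0] for error in errors]
    return [later - earlier for earlier, later in zip(sectors, sectors[1:])]

# The seven syslog lines quoted above; "md: disk4 write error" lines are skipped.
SAMPLE = [
    "Jun 23 22:18:48 ShadowOfIntent kernel: handle_stripe write error: 1080263264/4, count: 1",
    "Jun 23 22:18:48 ShadowOfIntent kernel: md: disk4 write error",
    "Jun 23 22:18:48 ShadowOfIntent kernel: handle_stripe write error: 1080263272/4, count: 1",
    "Jun 23 22:18:48 ShadowOfIntent kernel: md: disk4 write error",
    "Jun 23 22:18:48 ShadowOfIntent kernel: handle_stripe write error: 1080263280/4, count: 1",
    "Jun 23 22:18:48 ShadowOfIntent kernel: md: disk4 write error",
    "Jun 23 22:18:48 ShadowOfIntent kernel: handle_stripe write error: 1080263288/4, count: 1",
]

errors = parse_write_errors(SAMPLE)
print(errors)
print(sector_gaps(errors))
```

On the lines quoted here the numbers step by 8 each time, so the failures are back to back rather than scattered. That pattern alone does not tell you whether the cause is a bad region on the platter or the drive dropping out mid-write.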

Although you did a smart test, was it a long test?

smartctl -t long

Attach the current smart test.

I would probably bring the array to an idle state without emhttp and run smartctl -t long on it, then check the SMART report for status.

dgaschk · June 25, 2013

Post a SMART report. The test can wait.

AeroSteveO · June 25, 2013

Here's a short SMART report. I started a long one, but it hasn't finished yet for some reason. I'll have the long SMART test done tomorrow if it's still being slow.

Edit: I just got back to my server to find that the long SMART test still hadn't finished after 8 hours of it 'running'. My server is also giving odd segfaults occasionally when the smartctl command is used:

Jun 25 15:32:36 ShadowOfIntent kernel: mdcmd (23): import 22 0,0 (unRAID engine)
Jun 25 15:32:36 ShadowOfIntent kernel: mdcmd (24): import 23 0,0 (unRAID engine)
Jun 25 15:32:36 ShadowOfIntent emhttp_event: driver_loaded (System)
Jun 25 15:32:36 ShadowOfIntent kernel: smartctl[12019]: segfault at 52f0d641 ip 080aa050 sp bfacd6a0 error 6 (Errors)
Jun 25 15:32:38 ShadowOfIntent emhttp: shcmd (14316): rmmod md-mod |$stuff$ logger (Other emhttp)
Jun 25 15:32:38 ShadowOfIntent emhttp: shcmd (14317): modprobe md-mod super=/boot/config/super.dat slots=24 |$stuff$ logger (unRAID engine)
Jun 25 15:32:38 ShadowOfIntent emhttp: shcmd (14318): udevadm settle (Other emhttp)
Jun 25 15:32:38 ShadowOfIntent kernel: md: unRAID driver removed (System)
Jun 25 15:32:38 ShadowOfIntent kernel: md: unRAID driver 2.1.5 installed (System)

smart.txt

AeroSteveO · June 26, 2013

After fighting with smartctl for a bit, I managed to get it to output a long SMART report on the drive that's been giving me problems.

smartreportlong.txt

Joe L. · June 26, 2013

This line in the smart report indicates the disk retracted the heads upon power failure 195 times.

192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 195
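For anyone who wants to pull this attribute out of a saved report programmatically, here is a minimal Python sketch (not part of the original post). It assumes the row follows the usual ten-column layout of the smartctl attribute table (ID, name, flag, value, worst, thresh, type, updated, when failed, raw value). The names SmartAttribute and parse_attribute_row are hypothetical.

```python
from typing import NamedTuple

class SmartAttribute(NamedTuple):
    """One row of the SMART attribute table (assumed ten-column layout)."""
    attr_id: int
    name: str
    flag: str
    value: int
    worst: int
    thresh: int
    attr_type: str
    updated: str
    when_failed: str
    raw_value: str

def parse_attribute_row(row: str) -> SmartAttribute:
    """Split a whitespace-separated attribute row into named fields."""
    # Split at most 9 times so a raw value containing spaces stays in one piece.
    parts = row.split(None, 9)
    if len(parts) != 10:
        raise ValueError(f"expected 10 columns, got {len(parts)}: {row!r}")
    return SmartAttribute(
        int(parts[0]), parts[1], parts[2],
        int(parts[3]), int(parts[4]), int(parts[5]),
        parts[6], parts[7], parts[8], parts[9].strip(),
    )

# The row quoted above from the attached report.
ROW = "192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 195"

attr = parse_attribute_row(ROW)
print(f"{attr.name}: raw={attr.raw_value}, normalized={attr.value}, threshold={attr.thresh}")
```

The normalized value (200) is far above the threshold (000), so this attribute on its own would not flag the drive as failing. The raw count of 195 is the number to watch.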

You might want to check and re-seat the power cable to the drive (or any splitters, drive trays, or backplanes involved).

If there is a loose connection or an intermittent cable, you'll find the disk failing intermittently.

With the array stopped, but with the drives spinning, you might gently move the cables. You should not hear any disk spin down or retract its heads. (I had a splitter in my server that was improperly crimped at its connectors... nearly the same symptoms as yours. It causes "hair-loss", as you'll pull your hair out trying to figure out what is happening.)

Joe L.

AeroSteveO · June 26, 2013

Now that you mention it, that HDD is plugged into one of my splitters, alongside my VM drive (not included in the array). I'll plug it into a different power connection. Thanks for the help, I wouldn't have noticed that.
