New HDD spinning down randomly, I/O errors, passes tests

LumpyCustard · February 6, 2021

I recently installed a second parity drive into my unraid box as well as an extra 6TB HDD and all seems to be working fine, however upon checking over my server today I noticed that the "new" (second hand known working HGST) drive had spun down and the temperature was no longer being monitored.

I opened the drive and spun it up, then completed a short SMART test and it passed. The drive was only initialised this morning after pre-clearing and self testing so i didn't see fit to run another extended self test considering it passed.

image.png.ef429d6e1c5584538dcc464d640dc52a.png

I've manually spun up the drive and despite now running, it still looks as though something is wrong.

I checked the logs (see attached) and saw quite a bit of noise about the new drive from yesterday, however I'm unsure what's going on.

[...]
Feb  5 18:34:10 devoraid kernel: sd 10:0:3:0: [sdo] tag#183 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x00
Feb  5 18:34:10 devoraid kernel: sd 10:0:3:0: [sdo] tag#183 CDB: opcode=0x88 88 00 00 00 00 00 00 00 00 00 00 00 00 20 00 00
Feb  5 18:34:10 devoraid kernel: print_req_error: I/O error, dev sdo, sector 0
Feb  5 18:34:10 devoraid kernel: mdcmd (11): import 10 sdo 64 5860522532 0 HGST_HUS726060ALE610_K1JP673D
Feb  5 18:34:10 devoraid kernel: md: import disk10: (sdo) HGST_HUS726060ALE610_K1JP673D size: 5860522532 
Feb  5 18:34:10 devoraid kernel: md: disk10 new disk
[...]
Feb  5 18:34:10 devoraid kernel: mdcmd (30): import 29 sdn 64 7814026532 0 WDC_WD80EFAX-68KNBN0_VGHT5R4G
Feb  5 18:34:10 devoraid kernel: md: import disk29: (sdn) WDC_WD80EFAX-68KNBN0_VGHT5R4G size: 7814026532 
[...]
Feb  5 18:34:26 devoraid emhttpd: shcmd (2459): udevadm settle
Feb  5 18:34:26 devoraid kernel: Buffer I/O error on dev md10, logical block 0, async page read
Feb  5 18:34:26 devoraid kernel: Buffer I/O error on dev md10, logical block 0, async page read
Feb  5 18:34:26 devoraid kernel: Buffer I/O error on dev md10, logical block 4, async page read
Feb  5 18:34:26 devoraid kernel: Buffer I/O error on dev md10, logical block 8, async page read
Feb  5 18:34:26 devoraid kernel: Buffer I/O error on dev md10, logical block 16, async page read
Feb  5 18:34:26 devoraid kernel: Buffer I/O error on dev md10, logical block 32, async page read
Feb  5 18:34:26 devoraid kernel: Buffer I/O error on dev md10, logical block 64, async page read
Feb  5 18:34:26 devoraid kernel: Buffer I/O error on dev md10, logical block 128, async page read
Feb  5 18:34:26 devoraid kernel: Buffer I/O error on dev md10, logical block 256, async page read
Feb  5 18:34:26 devoraid kernel: Buffer I/O error on dev md10, logical block 512, async page read
[...]
Feb  5 18:34:33 devoraid kernel: Buffer I/O error on dev md10, logical block 1465130608, async page read
[...]

A few days ago 2 brand new Toshiba 6TB drives had I/O errors, I unplugged them from my mobo's SATA ports and installed a spare PERC H310 (in IT mode) -- since then things have been going OK. The added HGST drive is not showing read errors like the Toshiba drives, and since everything is now connected to the "new" H310 things look ok, but the I/O errors are concerning.

Can anyone provide some input on what's going on? Should i just get rid of the drive?

Logs and diagnostics attached.

syslog.txt devoraid-diagnostics-20210206-1514.zip

Edited February 6, 2021 by LumpyCustard

JorgeB · February 6, 2021

The async page read errors are normal during a disk clear, it's an Unraid issue, the first controller related error not so normal, but by itself nothing to worry about for now, keep monitoring.

LumpyCustard · February 6, 2021

Thanks for the info. I'm jittery because this happened 2 days ago to two brand new (out of the foil) Toshiba drives (see screenshots below)

On Wednesday my server was offline (docker, plugins, everything was inaccessible), rebooted and by Thursday i had errors on the two Toshiba's.

I've relocated those drives from the onboard SATA ports to a new PERC H310 along with the problematic HGST.

Rebuilt my parity and verified data on the Toshiba's, whole process took over 2 days to complete due to doing it in steps.

It just looks like the same symptoms and i'm concerned i'll lose a bunch of data, which is why i bought another 8TB and went dual parity.

image.png.4bfe0a8c18ac2c98ed71594ac89e8d6e.png

Edited February 6, 2021 by LumpyCustard

JorgeB · February 6, 2021

Those are different, they were actual read errors.

New HDD spinning down randomly, I/O errors, passes tests

Recommended Posts

LumpyCustard

Link to comment

JorgeB

Link to comment

LumpyCustard

Link to comment

JorgeB

Link to comment

Join the conversation