[SOLVEDish] UnRAID reports pending sectors, SMART disagrees


weirdcrap


UnRAID v6.8.3

 

void-diagnostics-20200915-0707.zip

 

My monthly parity check on my backup server has produced read errors on disk 1 for the last two months...

 

The first time it happened, there were no reports of pending or reallocated sectors, and the drive passed both a short and a long SMART self-test, so I wrote it off as a fluke and went on with my life.

 

This morning it happened again within minutes of the parity check starting, this time with UnRAID claiming there are 2 pending sectors:

 

image.png.c0ac7a0e3b911c087e07e0c45f85233c.png

However, when I look at the drive stats in the context menu, SMART doesn't report any pending or reallocated sectors.

image.thumb.png.ec4b88f317adea86829510e4a8deefd6.png

 

I plan on moving the drive to a different slot to see whether the error follows the disk or stays with the slot.

 

Anyone ever seen UnRAID misreport pending sectors like that before? Is SMART just slow on the uptake?

 

EDIT: Swapped the disk with another slot; rerunning a nocorrect check now.

 

EDIT2: It appears to be following the disk: different slot, same disk with read errors. UnRAID reported no pending sectors this time, just read errors.

 

Sep 15 07:25:06 VOID kernel: mdcmd (57): check nocorrect
Sep 15 07:25:06 VOID kernel: md: recovery thread: check P ...
Sep 15 07:25:24 VOID emhttpd: cmd: /usr/local/emhttp/plugins/dynamix/scripts/tail_log syslog
Sep 15 07:26:20 VOID emhttpd: cmd: /usr/local/emhttp/plugins/dynamix/scripts/tail_log syslog
Sep 15 07:28:31 VOID ntpd[1859]: kernel reports TIME_ERROR: 0x41: Clock Unsynchronized
Sep 15 07:28:52 VOID kernel: mpt2sas_cm1: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
Sep 15 07:28:52 VOID kernel: mpt2sas_cm1: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
Sep 15 07:28:52 VOID kernel: mpt2sas_cm1: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
Sep 15 07:28:52 VOID kernel: mpt2sas_cm1: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
Sep 15 07:28:52 VOID kernel: mpt2sas_cm1: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
Sep 15 07:28:52 VOID kernel: mpt2sas_cm1: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
Sep 15 07:28:52 VOID kernel: sd 10:0:0:0: [sdp] tag#3130 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Sep 15 07:28:52 VOID kernel: sd 10:0:0:0: [sdp] tag#3130 Sense Key : 0x3 [current] 
Sep 15 07:28:52 VOID kernel: sd 10:0:0:0: [sdp] tag#3130 ASC=0x11 ASCQ=0x0 
Sep 15 07:28:52 VOID kernel: sd 10:0:0:0: [sdp] tag#3130 CDB: opcode=0x88 88 00 00 00 00 00 01 ba 94 b0 00 00 04 00 00 00
Sep 15 07:28:52 VOID kernel: print_req_error: critical medium error, dev sdp, sector 29005968
Sep 15 07:28:52 VOID kernel: md: disk1 read error, sector=29005904
Sep 15 07:28:52 VOID kernel: md: disk1 read error, sector=29005912
Sep 15 07:28:52 VOID kernel: md: disk1 read error, sector=29005920
Sep 15 07:28:52 VOID kernel: md: disk1 read error, sector=29005928
Sep 15 07:29:16 VOID kernel: sd 10:0:0:0: attempting task abort! scmd(00000000ee3221de)
Sep 15 07:29:16 VOID kernel: sd 10:0:0:0: [sdp] tag#3104 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 e5 00
Sep 15 07:29:16 VOID kernel: scsi target10:0:0: handle(0x0009), sas_address(0x4433221104000000), phy(4)
Sep 15 07:29:16 VOID kernel: scsi target10:0:0: enclosure logical id(0x5c81f660e69c9f00), slot(7) 
Sep 15 07:29:17 VOID kernel: sd 10:0:0:0: task abort: SUCCESS scmd(00000000ee3221de)
Sep 15 07:29:17 VOID kernel: sd 10:0:0:0: Power-on or device reset occurred
Sep 15 07:29:22 VOID kernel: sd 10:0:0:0: Power-on or device reset occurred
Sep 15 07:29:34 VOID kernel: mpt2sas_cm1: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
Sep 15 07:29:34 VOID kernel: mpt2sas_cm1: log_info(0x31080000): originator(PL), code(0x08), sub_code(0x0000)
Sep 15 07:29:34 VOID kernel: sd 10:0:0:0: [sdp] tag#3105 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Sep 15 07:29:34 VOID kernel: sd 10:0:0:0: [sdp] tag#3105 Sense Key : 0x3 [current] 
Sep 15 07:29:34 VOID kernel: sd 10:0:0:0: [sdp] tag#3105 ASC=0x11 ASCQ=0x0 
Sep 15 07:29:34 VOID kernel: sd 10:0:0:0: [sdp] tag#3105 CDB: opcode=0x88 88 00 00 00 00 00 01 ba c8 b0 00 00 04 00 00 00
Sep 15 07:29:34 VOID kernel: print_req_error: critical medium error, dev sdp, sector 29019160
Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019096
Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019104
Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019112
Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019120
Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019128
Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019136
Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019144
Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019152
Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019160
Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019168
Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019176
Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019184
Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019192
Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019200
Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019208
Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019216
Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019224
Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019232
Sep 15 07:29:34 VOID kernel: md: disk1 read error, sector=29019240
Sep 15 07:29:39 VOID rc.diskinfo[12312]: SIGHUP received, forcing refresh of disks info.
Sep 15 07:30:57 VOID emhttpd: cmd: /usr/local/emhttp/plugins/dynamix/scripts/tail_log syslog
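
For anyone reading this log later, here is a rough sketch of how the two failed reads can be decoded. Opcode 0x88 is SCSI READ(16), sense key 0x3 is MEDIUM ERROR, and ASC 0x11 / ASCQ 0x00 is "unrecovered read error". The function name and the `EVENTS` list are just illustrative; the values are copied from the log lines above. The 64-sector difference between the `sdp` sector and the first `md` sector looks like the partition start offset, but that is an assumption, not something the log states.

```python
def decode_read16(cdb_hex):
    """Return (first_lba, sector_count) for a SCSI READ(16) CDB given as hex bytes."""
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 16 or cdb[0] != 0x88:
        raise ValueError("not a READ(16) CDB")
    first_lba = int.from_bytes(cdb[2:10], "big")       # bytes 2-9: logical block address
    sector_count = int.from_bytes(cdb[10:14], "big")   # bytes 10-13: transfer length
    return first_lba, sector_count


# (CDB, sector from print_req_error, first md read-error sector), copied from the log
EVENTS = [
    ("88 00 00 00 00 00 01 ba 94 b0 00 00 04 00 00 00", 29005968, 29005904),
    ("88 00 00 00 00 00 01 ba c8 b0 00 00 04 00 00 00", 29019160, 29019096),
]


def summarize(events):
    """Decode each event and relate the failing sector to the request and to md."""
    rows = []
    for cdb_hex, dev_sector, md_sector in events:
        first_lba, count = decode_read16(cdb_hex)
        rows.append({
            "first_lba": first_lba,
            "count": count,
            # Does the failing sector fall inside the range the request covered?
            "inside_request": first_lba <= dev_sector < first_lba + count,
            # Difference between the device sector and the md (array) sector.
            # Assumption: this is the partition start offset on the disk.
            "md_offset": dev_sector - md_sector,
        })
    return rows


if __name__ == "__main__":
    for row in summarize(EVENTS):
        print(row)
```

Both requests are 1024-sector reads, and in both cases the sector the kernel complains about sits inside the requested range, which fits the drive itself failing to read the media rather than a cabling or controller problem.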

 

EDIT: "Solved", per se. I know the drive is dying, though I do find the ghost pending sectors UnRAID reported strange. I have a new one on order.

Edited by weirdcrap
Just now, JorgeB said:

Don't know why it's reporting pending sectors. Did it report that they changed to 0 afterwards?

 

The disk does appear to be failing, though.

Nope. I got two notifications from UnRAID, one about the read errors and a second about the 2 pending sectors. I went to check the SMART stats, saw the discrepancy, canceled the check, and shut down the server to swap the disk with another.

 

The diag file posted above is from before I shut down the server and after the alerts were generated.

 

image.thumb.png.8a631680cd1f49ad830c79d0fed739e9.png

 

I'll let the parity check finish and order a replacement disk. I missed my warranty window by about 6 months =(


Hi,

 

Pending sectors are considered 'suspect' by the drive.

 

If the drive subsequently reads them OK, the pending count drops and they effectively disappear.

If the drive continues to be unable to read them, they will be remapped, typically the next time the sector is written. In that case the pending counter drops and the reallocated counter is incremented.

 

Looks like your drive is starting to fail, but not consistently enough for the drive to remap the sectors yet.
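
If you want to watch the counters yourself between checks, something like the following might help. It is only a sketch: it assumes the usual `smartctl -A` table layout and the common attribute IDs 5 (reallocated), 197 (current pending) and 198 (offline uncorrectable), which can differ by vendor. The function names are made up for illustration, and `read_attributes` needs root and smartmontools installed.

```python
import re
import subprocess

# Commonly used attribute IDs; names and meanings can vary between vendors.
WATCHED = {5: "reallocated", 197: "pending", 198: "offline_uncorrectable"}


def parse_smart_attributes(text):
    """Pull the raw values of the watched attributes out of `smartctl -A` output."""
    values = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit():
            attr_id = int(fields[0])
            match = re.match(r"\d+", fields[9])  # RAW_VALUE is the 10th column
            if attr_id in WATCHED and match:
                values[WATCHED[attr_id]] = int(match.group())
    return values


def read_attributes(device):
    """Run `smartctl -A` on a device such as /dev/sdp and parse the result."""
    result = subprocess.run(["smartctl", "-A", device],
                            capture_output=True, text=True)
    return parse_smart_attributes(result.stdout)


def describe_change(before, after):
    """Compare two snapshots and say what happened to the pending sectors."""
    pending_delta = after.get("pending", 0) - before.get("pending", 0)
    realloc_delta = after.get("reallocated", 0) - before.get("reallocated", 0)
    if pending_delta > 0:
        return "new suspect sectors"
    if pending_delta < 0 and realloc_delta > 0:
        return "pending sectors remapped"
    if pending_delta < 0:
        return "pending sectors cleared without remapping"
    return "no change in pending sectors"
```

Taking one snapshot right after the read errors and another after the check finishes would show whether the pending count really went up and then dropped back to 0, which would explain a notification that SMART no longer backs up.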

 

https://kb.acronis.com/content/9133

 

 
