
Disks with read errors


Stubbs


I noticed these errors appearing in my log. The read and write counters on my array tab are all incrementing, except for Disk 3, which is staying completely static at 1,375,745 reads and 581 writes.

 

f1jzkj8.png

 

At first only disk0 (the parity drive) had errors, but now multiple disks do.

 

PiNTKl9.png

 

 


Sep  6 23:24:45 Tower kernel: ata6.00: exception Emask 0x0 SAct 0x3fc00 SErr 0x0 action 0x0
Sep  6 23:24:45 Tower kernel: ata6.00: irq_stat 0x40000008
Sep  6 23:24:45 Tower kernel: ata6.00: failed command: READ FPDMA QUEUED
Sep  6 23:24:45 Tower kernel: ata6.00: cmd 60/08:50:f8:8c:37/04:00:26:00:00/40 tag 10 ncq dma 528384 in
Sep  6 23:24:45 Tower kernel:         res 41/40:00:f8:8c:37/00:00:26:00:00/40 Emask 0x409 (media error) <F>
Sep  6 23:24:45 Tower kernel: ata6.00: status: { DRDY ERR }
Sep  6 23:24:45 Tower kernel: ata6.00: error: { UNC }
Sep  6 23:24:45 Tower kernel: ata6.00: ATA Identify Device Log not supported
Sep  6 23:24:45 Tower kernel: ata6.00: ATA Identify Device Log not supported
Sep  6 23:24:45 Tower kernel: ata6.00: configured for UDMA/133
Sep  6 23:24:45 Tower kernel: sd 6:0:0:0: [sde] tag#10 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=DRIVER_OK cmd_age=7s
Sep  6 23:24:45 Tower kernel: sd 6:0:0:0: [sde] tag#10 Sense Key : 0x3 [current] 
Sep  6 23:24:45 Tower kernel: sd 6:0:0:0: [sde] tag#10 ASC=0x11 ASCQ=0x4 
Sep  6 23:24:45 Tower kernel: sd 6:0:0:0: [sde] tag#10 CDB: opcode=0x88 88 00 00 00 00 00 26 37 8c f8 00 00 04 08 00 00
Sep  6 23:24:45 Tower kernel: blk_update_request: I/O error, dev sde, sector 641174776 op 0x0:(READ) flags 0x0 phys_seg 129 prio class 0
Sep  6 23:24:45 Tower kernel: md: disk0 read error, sector=641174712
Sep  6 23:24:45 Tower kernel: md: disk0 read error, sector=641174720
Sep  6 23:24:45 Tower kernel: md: disk0 read error, sector=641174728
Sep  6 23:24:45 Tower kernel: md: disk0 read error, sector=641174736
Sep  6 23:24:45 Tower kernel: md: disk0 read error, sector=641174744
Sep  6 23:24:45 Tower kernel: md: disk0 read error, sector=641174752
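
As a cross-check on the log above, the failing address can be recovered from the CDB bytes. What follows is a minimal Python sketch, not something from the original post, and `decode_read_cdb` is a hypothetical helper name. It decodes the READ(16) CDB from the first error and compares it with the sector reported by `blk_update_request`. The 64-sector gap between the device sector and the first `md` sector is consistent with the array partition starting at sector 64, but that is an assumption about this system's partition layout, not something the log states.

```python
from typing import Tuple


def decode_read_cdb(cdb_hex: str) -> Tuple[int, int]:
    """Return (start LBA, sector count) for a READ(10) or READ(16) CDB.

    cdb_hex is the space-separated hex dump from the kernel log, opcode first.
    """
    cdb = bytes.fromhex(cdb_hex)
    opcode = cdb[0]
    if opcode == 0x88:  # READ(16): LBA in bytes 2-9, transfer length in bytes 10-13
        return int.from_bytes(cdb[2:10], "big"), int.from_bytes(cdb[10:14], "big")
    if opcode == 0x28:  # READ(10): LBA in bytes 2-5, transfer length in bytes 7-8
        return int.from_bytes(cdb[2:6], "big"), int.from_bytes(cdb[7:9], "big")
    raise ValueError(f"unsupported opcode 0x{opcode:02x}")


# CDB copied from the 23:24:45 error on sde
lba, count = decode_read_cdb("88 00 00 00 00 00 26 37 8c f8 00 00 04 08 00 00")
print(lba)          # 641174776, the sector in the blk_update_request line
print(count * 512)  # 528384, the "ncq dma" byte count (assuming 512-byte sectors)

# First md sector reported for this request; the gap is the assumed partition offset
first_md_sector = 641174712
print(lba - first_md_sector)  # 64
```

The same 64-sector difference shows up in the later errors in this thread, which suggests the `md` sector numbers are simply partition-relative versions of the device sectors.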

 

And they keep recurring at intervals.

 

Sep  6 23:41:25 Tower kernel: ata6.00: exception Emask 0x0 SAct 0x3fe8 SErr 0x0 action 0x0
Sep  6 23:41:25 Tower kernel: ata6.00: irq_stat 0x40000008
Sep  6 23:41:25 Tower kernel: ata6.00: failed command: READ FPDMA QUEUED
Sep  6 23:41:25 Tower kernel: ata6.00: cmd 60/40:18:b8:54:b2/05:00:93:00:00/40 tag 3 ncq dma 688128 in
Sep  6 23:41:25 Tower kernel:         res 41/40:00:b8:54:b2/00:00:93:00:00/40 Emask 0x409 (media error) <F>
Sep  6 23:41:25 Tower kernel: ata6.00: status: { DRDY ERR }
Sep  6 23:41:25 Tower kernel: ata6.00: error: { UNC }
Sep  6 23:41:25 Tower kernel: ata6.00: ATA Identify Device Log not supported
Sep  6 23:41:25 Tower kernel: ata6.00: ATA Identify Device Log not supported
Sep  6 23:41:25 Tower kernel: ata6.00: configured for UDMA/133
Sep  6 23:41:25 Tower kernel: sd 6:0:0:0: [sde] tag#3 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=DRIVER_OK cmd_age=7s
Sep  6 23:41:25 Tower kernel: sd 6:0:0:0: [sde] tag#3 Sense Key : 0x3 [current] 
Sep  6 23:41:25 Tower kernel: sd 6:0:0:0: [sde] tag#3 ASC=0x11 ASCQ=0x4 
Sep  6 23:41:25 Tower kernel: sd 6:0:0:0: [sde] tag#3 CDB: opcode=0x88 88 00 00 00 00 00 93 b2 54 b8 00 00 05 40 00 00
Sep  6 23:41:25 Tower kernel: blk_update_request: I/O error, dev sde, sector 2477937848 op 0x0:(READ) flags 0x0 phys_seg 168 prio class 0
Sep  6 23:41:25 Tower kernel: md: disk0 read error, sector=2477937784
Sep  6 23:41:25 Tower kernel: md: disk0 read error, sector=2477937792
Sep  6 23:41:25 Tower kernel: md: disk0 read error, sector=2477937800
Sep  6 23:41:25 Tower kernel: md: disk0 read error, sector=2477937808
Sep  6 23:41:25 Tower kernel: md: disk0 read error, sector=2477937816
Sep  6 23:41:25 Tower kernel: md: disk0 read error, sector=2477937824
Sep  6 23:41:25 Tower kernel: md: disk0 read error, sector=2477937832
Sep  6 23:41:25 Tower kernel: md: disk0 read error, sector=2477937840
Sep  6 23:41:25 Tower kernel: md: disk0 read error, sector=2477937848
Sep  6 23:41:25 Tower kernel: md: disk0 read error, sector=2477937856

 

Sep  7 00:00:48 Tower kernel: ata9.00: cmd 60/08:18:68:71:11/05:00:9e:00:00/40 tag 3 ncq dma 659456 in
Sep  7 00:00:48 Tower kernel:         res 41/40:00:68:71:11/00:00:9e:00:00/40 Emask 0x409 (media error) <F>
Sep  7 00:00:48 Tower kernel: ata9.00: status: { DRDY ERR }
Sep  7 00:00:48 Tower kernel: ata9.00: error: { UNC }
Sep  7 00:00:48 Tower kernel: ata9.00: ATA Identify Device Log not supported
Sep  7 00:00:48 Tower kernel: ata9.00: ATA Identify Device Log not supported
Sep  7 00:00:48 Tower kernel: ata9.00: configured for UDMA/133
Sep  7 00:00:48 Tower kernel: sd 9:0:0:0: [sdf] tag#3 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=DRIVER_OK cmd_age=7s
Sep  7 00:00:48 Tower kernel: sd 9:0:0:0: [sdf] tag#3 Sense Key : 0x3 [current] 
Sep  7 00:00:48 Tower kernel: sd 9:0:0:0: [sdf] tag#3 ASC=0x11 ASCQ=0x4 
Sep  7 00:00:48 Tower kernel: sd 9:0:0:0: [sdf] tag#3 CDB: opcode=0x28 28 00 9e 11 71 68 00 05 08 00
Sep  7 00:00:48 Tower kernel: blk_update_request: I/O error, dev sdf, sector 2651943272 op 0x0:(READ) flags 0x0 phys_seg 161 prio class 0
Sep  7 00:00:48 Tower kernel: md: disk2 read error, sector=2651943208
Sep  7 00:00:48 Tower kernel: md: disk2 read error, sector=2651943216
Sep  7 00:00:48 Tower kernel: md: disk2 read error, sector=2651943224
Sep  7 00:00:48 Tower kernel: md: disk2 read error, sector=2651943232
Sep  7 00:00:48 Tower kernel: md: disk2 read error, sector=2651943240
Sep  7 00:00:48 Tower kernel: md: disk2 read error, sector=2651943248
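
To see at a glance which array disks are affected and where, the `md` read-error lines can be grouped by disk. This is only an illustrative Python sketch: `summarize_md_errors` is a hypothetical helper, and the sample lines are a small subset copied from the log excerpts above, not the full syslog.

```python
import re
from collections import defaultdict

# Matches lines such as "md: disk0 read error, sector=641174712"
MD_ERROR = re.compile(r"md: (disk\d+) read error, sector=(\d+)")


def summarize_md_errors(lines):
    """Group md read-error sectors by disk; report count, first and last sector."""
    sectors = defaultdict(list)
    for line in lines:
        match = MD_ERROR.search(line)
        if match:
            sectors[match.group(1)].append(int(match.group(2)))
    return {
        disk: {"count": len(found), "first": min(found), "last": max(found)}
        for disk, found in sectors.items()
    }


sample = [
    "Sep  6 23:24:45 Tower kernel: md: disk0 read error, sector=641174712",
    "Sep  6 23:24:45 Tower kernel: md: disk0 read error, sector=641174752",
    "Sep  7 00:00:48 Tower kernel: ata9.00: error: { UNC }",
    "Sep  7 00:00:48 Tower kernel: md: disk2 read error, sector=2651943208",
]

for disk, info in summarize_md_errors(sample).items():
    print(disk, info)
```

Run against the complete syslog from the diagnostics, a summary like this would show whether the errors cluster in a few regions of each disk or are spread across it.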

 

Diagnostics attached.

I also attached a SMART test for the parity drive, although it got stuck at 90% and won't complete. It says "Interrupted (host reset)".

 

For what it's worth, a couple of weeks ago I had some big problems with power failures and didn't have a UPS working at the time.

tower-diagnostics-20220906-1405.zip (disk0) tower-smart-20220906-2339.zip

4 minutes ago, trurl said:

No point doing anything until you resolve your hardware problems.

 

Do these drives share a power cable? Parity is on one controller, but disks 2 and 3 are on another.

I have a slightly unorthodox setup. I believe disks 2 and 3 are installed inside my server's case, whereas the parity disk and disks 1 and 4 are in the hotswap bay at the front of the case. This bay (and its fan) is powered by two SATA power connectors.

I've had to swap them around over the years because of past problems. One time I had a defective cable; another time one of the hotswap bays had the wrong mounting screws, causing a faulty connection. I never really kept track of where each specific disk is because Unraid remembers their IDs anyway.

 

7 minutes ago, JorgeB said:

They are logged as actual disk errors, though possibly only parity has a problem. Run an extended SMART test on all disks, and make sure spin down is disabled.

The thing is, I can't (or rather, couldn't) even complete the short test. It got stuck at 90% before showing the message "Interrupted (host reset)".

 

 

In an attempt to fix this, I restarted the server, and the read errors have gone away. I assume they're still actually there though, and I'll try to run an extended test.

First I'm backing everything important up.
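
For reference, the extended test suggested above can also be started from a shell with `smartctl` (from smartmontools). The Python sketch below only builds and prints the commands by default. The function names are hypothetical, and the device paths `/dev/sde` and `/dev/sdf` are taken from the log excerpts, so they are an assumption: device letters can change after a reboot and should be checked first.

```python
import subprocess


def start_extended_test_commands(devices):
    """Build the smartctl commands that start a long (extended) self-test."""
    return [["smartctl", "-t", "long", dev] for dev in devices]


def selftest_log_commands(devices):
    """Build the smartctl commands that print each drive's self-test log."""
    return [["smartctl", "-l", "selftest", dev] for dev in devices]


def run(commands, dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # smartctl uses non-zero exit codes to flag disk problems,
            # so a non-zero status is not treated as a failure here.
            subprocess.run(cmd, check=False)


devices = ["/dev/sde", "/dev/sdf"]  # assumed from the log; verify before use
run(start_extended_test_commands(devices))
run(selftest_log_commands(devices))
```

An extended test on a large drive can take many hours, and a spin down or reset part-way through would interrupt it, which is why disabling spin down first matters.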

 
