Parity Disk Failed


maric

So...

 

I've been away from home for a week and on my return noticed that my 6.12.3 system is reporting the following, and that the parity disk is offline. From my own novice checks of the logs, the disk looks to have had a hardware error. Is that right?

 

Kind of annoying considering it's only about 2 months old and still under warranty.

 

Before I do any shutdowns or reboots, I just wanted to check with the experts what my next steps should be. I'm thinking I'll have to RMA the faulty disk and, when a replacement arrives, install it and let parity rebuild.

 

Am I OK to carry on using the (unprotected) system in the meantime? It's only really a media server. Is it OK to stop the array, pull the faulty physical disk and restart? I could use a spare 1.92 SSD as a temporary parity disk, but I was under the impression that spinners were better than SSDs, hence my reason for buying the 4TB in the first place.

 

Thanks in advance.

 

 

Diagnostics attached.

 

Sep 29 19:03:10 Borg kernel: hpsa 0000:03:00.0: scsi 1:0:8:0: resetting physical  Direct-Access     ATA      ST4000LM024-2AN1 PHYS DRV SSDSmartPathCap- En- Exp=1
Sep 29 19:03:16 Borg kernel: hpsa 0000:03:00.0: invalid command: LUN:0000000000800701 CDB:01030000000000000000000000000000
Sep 29 19:03:16 Borg kernel: hpsa 0000:03:00.0: probably means device no longer present
Sep 29 19:03:16 Borg kernel: hpsa 0000:03:00.0: scsi 1:0:8:0: reset physical  failed Direct-Access     ATA      ST4000LM024-2AN1 PHYS DRV SSDSmartPathCap- En- Exp=1
Sep 29 19:03:16 Borg kernel: sd 1:0:8:0: Device offlined - not ready after error recovery
### [PREVIOUS LINE REPEATED 2 TIMES] ###
Sep 29 19:03:16 Borg kernel: sd 1:0:8:0: [sdi] tag#883 UNKNOWN(0x2003) Result: hostbyte=0x05 driverbyte=DRIVER_OK cmd_age=65s
Sep 29 19:03:16 Borg kernel: sd 1:0:8:0: [sdi] tag#883 CDB: opcode=0x35 35 00 00 00 00 00 00 00 00 00
Sep 29 19:03:16 Borg kernel: I/O error, dev sdi, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 2
Sep 29 19:03:16 Borg kernel: sd 1:0:8:0: [sdi] tag#884 UNKNOWN(0x2003) Result: hostbyte=0x05 driverbyte=DRIVER_OK cmd_age=65s
Sep 29 19:03:16 Borg kernel: sd 1:0:8:0: [sdi] tag#884 CDB: opcode=0x88 88 00 00 00 00 00 70 0c 9d 80 00 00 00 08 00 00
Sep 29 19:03:16 Borg kernel: I/O error, dev sdi, sector 1879874944 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
Sep 29 19:03:16 Borg kernel: md: disk0 read error, sector=1879874880
Sep 29 19:03:16 Borg kernel: sd 1:0:8:0: [sdi] tag#885 UNKNOWN(0x2003) Result: hostbyte=0x05 driverbyte=DRIVER_OK cmd_age=65s
Sep 29 19:03:16 Borg kernel: sd 1:0:8:0: [sdi] tag#885 CDB: opcode=0x8a 8a 00 00 00 00 00 70 0c 9d 78 00 00 00 08 00 00
Sep 29 19:03:16 Borg kernel: I/O error, dev sdi, sector 1879874936 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 2
Sep 29 19:03:16 Borg kernel: md: disk0 write error, sector=1879874872
Sep 29 19:03:16 Borg kernel: sd 1:0:8:0: rejecting I/O to offline device
Sep 29 19:03:16 Borg kernel: I/O error, dev sdi, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 2
Sep 29 19:03:16 Borg kernel: I/O error, dev sdi, sector 955866864 op 0x0:(READ) flags 0x0 phys_seg 16 prio class 2
Sep 29 19:03:16 Borg kernel: md: disk0 read error, sector=955866800
Sep 29 19:03:16 Borg kernel: md: disk0 read error, sector=955866808
Sep 29 19:03:16 Borg kernel: md: disk0 read error, sector=955866816
Sep 29 19:03:16 Borg kernel: md: disk0 read error, sector=955866824
Sep 29 19:03:16 Borg kernel: md: disk0 read error, sector=955866832
Sep 29 19:03:16 Borg kernel: md: disk0 read error, sector=955866840
Sep 29 19:03:16 Borg kernel: md: disk0 read error, sector=955866848
Sep 29 19:03:16 Borg kernel: md: disk0 read error, sector=955866856
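
For anyone who would rather tally these errors than eyeball them, here is a rough Python sketch that counts the block-layer `I/O error` lines per device and the `md: diskN read/write error` lines per array slot. The regular expressions are written against the exact line shapes in the excerpt above and are an assumption about the format; other kernel versions may word these messages differently. The `summarize` function and the `SAMPLE` text are illustrative names, not part of any Unraid tool.

```python
import re
from collections import Counter

# A few lines copied from the syslog excerpt above.
SAMPLE = """\
Sep 29 19:03:16 Borg kernel: I/O error, dev sdi, sector 1879874944 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
Sep 29 19:03:16 Borg kernel: md: disk0 read error, sector=1879874880
Sep 29 19:03:16 Borg kernel: I/O error, dev sdi, sector 1879874936 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 2
Sep 29 19:03:16 Borg kernel: md: disk0 write error, sector=1879874872
"""

# Patterns assumed from the excerpt; adjust if your log lines differ.
IO_RE = re.compile(r"I/O error, dev (\w+), sector (\d+) op 0x[0-9a-f]+:\((\w+)\)")
MD_RE = re.compile(r"md: disk(\d+) (read|write) error, sector=(\d+)")


def summarize(log_text):
    """Count block-layer I/O errors per (device, op) and md errors per (slot, op)."""
    io_counts = Counter()
    md_counts = Counter()
    for line in log_text.splitlines():
        m = IO_RE.search(line)
        if m:
            io_counts[(m.group(1), m.group(3))] += 1
            continue
        m = MD_RE.search(line)
        if m:
            md_counts[(int(m.group(1)), m.group(2))] += 1
    return io_counts, md_counts


if __name__ == "__main__":
    io_counts, md_counts = summarize(SAMPLE)
    for (dev, op), n in sorted(io_counts.items()):
        print(f"{dev}: {n} {op} error(s)")
    for (slot, op), n in sorted(md_counts.items()):
        print(f"md disk{slot}: {n} {op} error(s)")
```

To run it over a full log, pass the text of the syslog file from the diagnostics zip to `summarize` instead of `SAMPLE`. A tally like this only shows which device and slot the errors landed on; it cannot tell a dying disk apart from a cable, backplane or controller problem.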

 

[Attached screenshot: image.png]

 

 

borg-diagnostics-20231004-1135.zip


Oh, ok thanks.

 

I'll spin down and do just that, but FYI this disk is in a hot-plug 8-bay HP cage connected by 2 x SAS cables to an HP H240 controller, so there are at least three other disks on the same cable. I'll reseat the caddy as well, though, as that won't hurt.

 

I'll be back with the SMART report in a few minutes.

1 minute ago, maric said:

The SMART report looks ok (I think) but system still shows disk as failed.

 

Technically the disk is shown as 'Disabled' rather than failed. This happens whenever a write to the disk fails for any reason, because parity and the data disks are then no longer in sync. You need to rebuild the disk to clear the disabled status.
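
To illustrate why a single failed write is enough, here is a toy Python model of single XOR parity. It is only a sketch under simplifying assumptions (tiny in-memory "disks", made-up byte values, hypothetical helper names `xor_parity` and `in_sync`), not how Unraid itself is implemented.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def in_sync(data_blocks, parity_block):
    """True if the stored parity matches the parity recomputed from the data."""
    return xor_parity(data_blocks) == parity_block


if __name__ == "__main__":
    # Three toy data disks, two bytes each (values are made up).
    data = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]
    parity = xor_parity(data)
    print("in sync after initial build:", in_sync(data, parity))

    # A write lands on a data disk, but the matching parity write fails,
    # so the stored parity is now stale.
    data[1] = b"\x00\xff"
    print("in sync after failed parity write:", in_sync(data, parity))

    # Rebuilding recomputes parity from the data disks.
    parity = xor_parity(data)
    print("in sync after rebuild:", in_sync(data, parity))
```

Once parity is stale it can no longer be trusted to reconstruct a data disk, which is why the disk stays disabled until it is rebuilt, even if its SMART report looks healthy.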
