Disabled parity disk, stuck "reading" at 2.5GB/s


So I upgraded my parity disk to a 16TB Seagate Exos about a week ago, along with expanding my array and adding another cache drive. I pre-cleared the drive with pre-read and post-read, and ran an extended SMART test on it to make sure it was healthy before rebuilding parity on it. I also added a second 16TB Exos drive to the array.

 

Tonight I saw a notification that my parity disk is disabled. The Unraid dashboard says it's "reading" at 2.5-3GB/s, and it had accumulated billions of errors:

[Screenshot: parity_stuck_read_rate]

 

Here is the section of log where the drive gets disabled:

May 11 19:26:34 Tower  emhttpd: spinning down /dev/sdh
May 11 19:29:16 Tower kernel: ata3.00: exception Emask 0x0 SAct 0x190000 SErr 0x0 action 0x6 frozen
May 11 19:29:16 Tower kernel: ata3.00: failed command: WRITE FPDMA QUEUED
May 11 19:29:16 Tower kernel: ata3.00: cmd 61/c0:80:80:34:dd/00:00:8c:00:00/40 tag 16 ncq dma 98304 out
May 11 19:29:16 Tower kernel:         res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
May 11 19:29:16 Tower kernel: ata3.00: status: { DRDY }
May 11 19:29:16 Tower kernel: ata3.00: failed command: WRITE FPDMA QUEUED
May 11 19:29:16 Tower kernel: ata3.00: cmd 61/40:98:40:35:dd/05:00:8c:00:00/40 tag 19 ncq dma 688128 out
May 11 19:29:16 Tower kernel:         res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
May 11 19:29:16 Tower kernel: ata3.00: status: { DRDY }
May 11 19:29:16 Tower kernel: ata3.00: failed command: WRITE FPDMA QUEUED
May 11 19:29:16 Tower kernel: ata3.00: cmd 61/40:a0:80:3a:dd/05:00:8c:00:00/40 tag 20 ncq dma 688128 out
May 11 19:29:16 Tower kernel:         res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
May 11 19:29:16 Tower kernel: ata3.00: status: { DRDY }
May 11 19:29:16 Tower kernel: ata3: hard resetting link
May 11 19:29:16 Tower kernel: ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
May 11 19:29:21 Tower kernel: ata3.00: qc timeout (cmd 0xec)
May 11 19:29:21 Tower kernel: ata3.00: failed to IDENTIFY (I/O error, err_mask=0x4)
May 11 19:29:21 Tower kernel: ata3.00: revalidation failed (errno=-5)
May 11 19:29:21 Tower kernel: ata3: hard resetting link
May 11 19:29:22 Tower kernel: ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
May 11 19:29:32 Tower kernel: ata3.00: qc timeout (cmd 0xec)
May 11 19:29:32 Tower kernel: ata3.00: failed to IDENTIFY (I/O error, err_mask=0x4)
May 11 19:29:32 Tower kernel: ata3.00: revalidation failed (errno=-5)
May 11 19:29:32 Tower kernel: ata3: limiting SATA link speed to 3.0 Gbps
May 11 19:29:32 Tower kernel: ata3: hard resetting link
May 11 19:29:32 Tower kernel: ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 320)
May 11 19:30:03 Tower kernel: ata3.00: qc timeout (cmd 0xec)
May 11 19:30:03 Tower kernel: ata3.00: failed to IDENTIFY (I/O error, err_mask=0x4)
May 11 19:30:03 Tower kernel: ata3.00: revalidation failed (errno=-5)
May 11 19:30:03 Tower kernel: ata3.00: disable device
May 11 19:30:03 Tower kernel: ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 320)
May 11 19:30:03 Tower kernel: ata3: EH complete
May 11 19:30:03 Tower kernel: sd 3:0:0:0: [sdc] tag#2 UNKNOWN(0x2003) Result: hostbyte=0x04 driverbyte=DRIVER_OK cmd_age=77s
May 11 19:30:03 Tower kernel: sd 3:0:0:0: [sdc] tag#2 CDB: opcode=0x8a 8a 00 00 00 00 00 8c dd 3a 80 00 00 05 40 00 00
May 11 19:30:03 Tower kernel: I/O error, dev sdc, sector 2363308672 op 0x1:(WRITE) flags 0x0 phys_seg 168 prio class 0
May 11 19:30:03 Tower kernel: md: disk0 write error, sector=2363308608
May 11 19:30:03 Tower kernel: md: disk0 write error, sector=2363308616
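The last few log lines describe one failed write in three different ways, and decoding the CDB by hand shows that they agree. Here is a minimal Python sketch, assuming the CDB is a standard SCSI WRITE(16) (opcode 0x8a) and that the kernel counts in 512-byte sectors. `decode_write16` is a made-up helper name, not part of any tool.

```python
def decode_write16(cdb_hex: str) -> dict:
    """Decode a SCSI WRITE(16) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x8A:
        raise ValueError("not a 16-byte WRITE(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")      # bytes 2-9: starting logical block
    blocks = int.from_bytes(cdb[10:14], "big")  # bytes 10-13: transfer length in blocks
    return {"lba": lba, "blocks": blocks}


SECTOR_BYTES = 512  # assumption: 512-byte logical sectors

# CDB copied from the "[sdc] tag#2 CDB:" line in the log above
info = decode_write16("8a 00 00 00 00 00 8c dd 3a 80 00 00 05 40 00 00")

print(info["lba"])                    # 2363308672, same as "dev sdc, sector 2363308672"
print(info["blocks"] * SECTOR_BYTES)  # 688128, same as "ncq dma 688128 out"
print(info["lba"] - 2363308608)       # 64, offset between the sdc sector and the md sector
```

The 64-sector difference between the `sdc` sector and the first `md: disk0 write error` sector would be consistent with the array partition starting at sector 64 on the disk, although the log itself does not state that.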

 

How should I proceed? For now, I have just tried to stop the array, but it seems to have hung: the "stop" button is still greyed out.

 

My `syslog` share for the syslog server has an 85GB syslog file in it, so I think it's been spamming errors into the log file for several hours. Could that be what's preventing the array from stopping?
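A file that size can't be opened in an editor, but it can be streamed line by line to see which messages are filling it. Below is a rough Python sketch, assuming the usual `Mon DD HH:MM:SS host message` syslog layout seen in the excerpt above. `top_messages` is a hypothetical helper. Long numbers such as sector addresses are collapsed to `#` so repeated errors group together.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

LONG_NUMBER = re.compile(r"\d{4,}")  # sector numbers and similar long values


def top_messages(lines: Iterable[str], n: int = 5) -> List[Tuple[str, int]]:
    """Count messages by signature, reading one line at a time."""
    counts: Counter = Counter()
    for line in lines:
        # Drop the month, day, time and hostname fields; keep the message
        parts = line.split(None, 4)
        if len(parts) < 5:
            continue
        counts[LONG_NUMBER.sub("#", parts[4].rstrip())] += 1
    return counts.most_common(n)


# Three lines taken from the log excerpt above
sample = [
    "May 11 19:30:03 Tower kernel: md: disk0 write error, sector=2363308608",
    "May 11 19:30:03 Tower kernel: md: disk0 write error, sector=2363308616",
    "May 11 19:26:34 Tower  emhttpd: spinning down /dev/sdh",
]
for message, count in top_messages(sample):
    print(count, message)
# 2 kernel: md: disk0 write error, sector=#
# 1 emhttpd: spinning down /dev/sdh
```

For the real file, pass an open file handle instead of the list, for example `with open(path, errors="replace") as f: top_messages(f)`, where `path` is wherever the syslog share keeps the file. Only the counters are held in memory, so the size of the file should not matter much.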

tower-diagnostics-20230511-2341.zip

Edited by veri745

Solved by veri745

  • Author

My array now seems to be in some weird limbo state. Some places (like the Docker tab) say the array needs to be started, but the share filesystem still seems to be navigable from the web UI and console.

 

There's a `rsyslogd` process that is chewing up ~200% of a cpu, although it appears the 85GB syslog file is no longer growing.

 

Should I try a restart and hope for the best, or should I try to kill some processes first to make sure the array is shut down?


  • Author

Interestingly, the drives besides parity that are reporting errors (disk2 and disk4) are the same drives that were reporting errors a couple of months ago:

[Screenshot: disk_error_notification]

 

  • Community Expert

All 3 disks connected to this controller are having issues:

 

05:00.0 SATA controller [0106]: ASMedia Technology Inc. Device [1b21:1064] (rev 02)
    Subsystem: ZyDAS Technology Corp. Device [2116:2116]
    Kernel driver in use: ahci
    Kernel modules: ahci

 

Could be a bad controller, you can also try a different PCIe slot if available.
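To check for yourself which disks share a controller, one option on Linux is to resolve each `/sys/block/sdX` link and read the last PCI address in the path. The sketch below is illustrative only: `controller_of` and `disks_by_controller` are made-up names, and the example path is invented to resemble this system rather than copied from the diagnostics.

```python
import os
import re
from collections import defaultdict
from typing import Dict, List, Optional

# PCI address in domain:bus:device.function form, e.g. 0000:05:00.0
PCI_ADDR = re.compile(r"[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]")


def controller_of(sysfs_path: str) -> Optional[str]:
    """Return the last PCI address in a resolved sysfs device path."""
    matches = PCI_ADDR.findall(sysfs_path)
    return matches[-1] if matches else None


def disks_by_controller(sys_block: str = "/sys/block") -> Dict[Optional[str], List[str]]:
    """Group sdX devices by the PCI address of the controller they hang off."""
    groups: Dict[Optional[str], List[str]] = defaultdict(list)
    for name in sorted(os.listdir(sys_block)):
        if not name.startswith("sd"):
            continue
        resolved = os.path.realpath(os.path.join(sys_block, name))
        groups[controller_of(resolved)].append(name)
    return dict(groups)


# Hypothetical resolved path for a disk behind the ASMedia card at 05:00.0
example = ("/sys/devices/pci0000:00/0000:00:1c.4/0000:05:00.0/"
           "ata3/host3/target3:0:0/3:0:0:0/block/sdc")
print(controller_of(example))  # 0000:05:00.0
```

Running `disks_by_controller()` on the server itself should return a mapping from each controller address to the `sdX` names attached to it, which can then be compared with the disks that logged errors. USB devices resolve to the address of their USB host controller.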

  • Author
6 hours ago, JorgeB said:

All 3 disks connected to this controller are having issues:

 

05:00.0 SATA controller [0106]: ASMedia Technology Inc. Device [1b21:1064] (rev 02)
    Subsystem: ZyDAS Technology Corp. Device [2116:2116]
    Kernel driver in use: ahci
    Kernel modules: ahci

 

Could be a bad controller, you can also try a different PCIe slot if available.

The controller is brand-new (new since the linked thread). Of course it could be the controller, but that seems unlikely given the similar scenario I was in a couple of months ago, before I got it.

 

My immediate concern is how to deal with my system state as it stands. The array is currently still stuck "Stopping..."

I killed the `rsyslogd` process so it would stop spamming writes to my cache disk.

The web UI is still responsive, but the system won't respond to a "reboot" or `powerdown -r` from the terminal.

 

Do I hard-reset it?


  • Author

Alright, forced a hard reset. The system rebooted, and I started the array in maintenance mode and ran filesystem checks and SMART tests on the disks that had errors.

 

Nothing obvious found.

 

 

tower-diagnostics-20230512-0906.zip

  • Author
  • Solution

So I pulled open my server's case tonight to swap in all-new SATA cables and to move my PCIe SATA card to a different slot.

 

I discovered something interesting that may point to the failure mode I've been experiencing:

My parity drive and disks 2 and 4, the disks that had errors on them, are in a 4-bay hot-swappable drive cage, and they're also all connected to my 4-port SATA card.

One of the molex power connectors that powers the hotswap cage backplane board had come loose, so it was being powered by only two molex connectors instead of three.

 

I'm thinking that under certain load conditions, there wasn't enough juice getting to the drives in that drive cage.

  • Community Expert
On 5/14/2023 at 4:05 AM, veri745 said:

One of the molex power connectors that powers the hotswap cage backplane board had come loose, so it was being powered by only two molex connectors instead of three.

 

I'm thinking that under certain load conditions, there wasn't enough juice getting to the drives in that drive cage

That's a strong possibility, if that's another thing all disks have in common other than the controller.
