
Disk keeps getting disabled, but different disks logging syslog errors



I've got a disk that keeps getting disabled, but syslog shows errors for different disks.  I'm trying to determine which disk is the real problem.  Did I actually lose multiple disks at the same time?  Could I possibly have a SATA controller issue on my motherboard?

 

I feel confident I've got some type of hardware issue, but I can't figure out which part has failed.  Any help is greatly appreciated.

 

Details:
- sdj gets disabled.
- syslog has errors for sdg and sdk.  These are SATA SSD cache disks.
- The SMART report for all 3 disks is clean.  However, I have to reboot before I can access SMART data for sdj.
- I can add sdj back to the array, but the issue happens again in a few days.
- All 3 disks are plugged into SATA ports on my motherboard.
- I'm attaching syslog (collected before reboot), diagnostic files, and SMART logs.
- The first time the issue happened I had an extra problem: CPU hit 100% and something went wrong with parity and emulation, so data on sdj was not emulated.  Rebooting the system fixed this CPU/parity/emulation issue, and it has not happened again on later instances of sdj getting disabled.
- Unraid version 6.10.2.  I was on 6.10 rc4 when the issue started, but I upgraded as part of troubleshooting.
- I have a full backup.  I can restore if for some reason multiple disks need to be replaced or troubleshooting results in data loss.

smart_report_sdg_cache_disk.txt smart_report_sdk_cache_disk.txt syslog.txt smart_report_sdj_array_disk.txt thor-diagnostics-20220530-2010.zip


Diagnostics already includes syslog since reboot and SMART for all attached disks.

 

The 2 SMART reports you have posted for cache are the same disk (disk "letters" are not significant between reboots or if a drive disconnects). Looks like the other cache is actually sdb in those diagnostics, but it doesn't appear assigned, probably since it disconnected / reconnected. Maybe it shows as an unassigned device now.
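Since the sdX letters can change, it helps to pair each letter with the drive's model and serial before comparing syslog and SMART output. Here is a minimal sketch, assuming a Linux system where udev populates /dev/disk/by-id (I have not checked this against Unraid specifically). The function name map_kernel_names_to_ids is just made up for illustration.

```python
import os
from pathlib import Path


def map_kernel_names_to_ids(by_id_dir="/dev/disk/by-id"):
    """Return {kernel name: [stable by-id link names]} for whole disks.

    Assumes the usual udev layout, where each entry in by_id_dir is a
    symlink pointing at a device node such as /dev/sdg.
    """
    base = Path(by_id_dir)
    mapping = {}
    if not base.is_dir():
        # Directory missing (for example, not a Linux host): nothing to map.
        return mapping
    for link in sorted(base.iterdir()):
        if not link.is_symlink():
            continue
        # Skip partition links such as "...-part1"; only whole disks matter here.
        if "-part" in link.name:
            continue
        kernel_name = os.path.basename(os.path.realpath(link))
        mapping.setdefault(kernel_name, []).append(link.name)
    return mapping


if __name__ == "__main__":
    for kernel_name, ids in sorted(map_kernel_names_to_ids().items()):
        print(kernel_name, "->", ", ".join(ids))
```

Run it once before a reboot and once after. If the same serial shows up under a different letter, the two reports belong to the same physical drive.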

 

06:00.0 SATA controller [0106]: Marvell Technology Group Ltd. 88SE9215 PCIe 2.0 x1 4-port SATA 6 Gb/s Controller [1b4b:9215] (rev 11)
	Subsystem: Marvell Technology Group Ltd. 88SE9215 PCIe 2.0 x1 4-port SATA 6 Gb/s Controller [1b4b:9215]
	Kernel driver in use: ahci

Marvell controllers are NOT recommended.  Do you have enough ports without that? Could be the source of your problems.
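To see which disks actually hang off that controller, one option is to resolve each /sys/block/sdX entry and read the PCI address out of the resolved path. This is a rough sketch, assuming the standard Linux sysfs layout. Note that sysfs prefixes the PCI domain, so 06:00.0 from lspci would appear as 0000:06:00.0. The helper names controller_of and disks_by_controller are made up for this example.

```python
import os
import re

# Full PCI address as it appears in sysfs paths: domain:bus:device.function
PCI_ADDRESS = re.compile(r"[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]")


def controller_of(sysfs_path):
    """Return the last PCI address in a resolved sysfs path, or None.

    The last address in the path is the PCI device closest to the disk,
    which for a SATA drive is normally its controller.
    """
    matches = PCI_ADDRESS.findall(sysfs_path)
    return matches[-1] if matches else None


def disks_by_controller(sys_block="/sys/block"):
    """Group sdX devices by the PCI address of their controller."""
    result = {}
    if not os.path.isdir(sys_block):
        return result
    for name in sorted(os.listdir(sys_block)):
        if not name.startswith("sd"):
            continue
        resolved = os.path.realpath(os.path.join(sys_block, name))
        result.setdefault(controller_of(resolved), []).append(name)
    return result


if __name__ == "__main__":
    for address, disks in disks_by_controller().items():
        print(address, "->", " ".join(disks))
```

Any disks listed under 0000:06:00.0 would be on the Marvell ports, so those are the ones to move to other ports first if you can free any up.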

 

All drives appear to be mounted including cache even without the other disk for the pool. Some corruption on cache though.

 

Unrelated, but your system share has files on the array. Also, why 60G docker.img? Have you had problems filling it?

 

 


Thanks for the reply.  Unfortunately I do not have enough ports without the Marvell controller, but I think I have a SATA expansion card I can add to the system.  I'll give that a try and see if it helps.

 

I never even considered the SATA controller on the motherboard when I built this system a year ago.  Definitely something I'll think about in the future.  If I have to buy a new motherboard or SATA card, is there a recommended controller I should look for?

 

At the moment both cache disks appear to be working correctly.  Neither is unassigned.

 

60G docker.img... yes, I had space issues.  I forget what I was doing when I hit that issue.  It's been a while.

