PandaGod, May 1, 2023 (edited):

Today my scheduled parity check ran and completed without any errors found. However, I got an "Unraid array errors" notification for three of my drives with the following info:

Parity disk - (sdd) (errors 190)
Disk 1 - (sdc) (errors 197)
Disk 2 - (sde) (errors 183)

I have attached my logs below. The log keeps repeating the following line, which has only shown up today:

program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO

As there seem to be multiple disk-related issues, I am not sure where I should begin trying to solve this.

nas-syslog-20230501-2112.zip
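The deprecated-ioctl warning from smartctl is usually harmless noise; the array error counts are the real lead. As a starting point, a quick sketch for pulling the SMART attributes most relevant to cabling/controller faults (device names taken from the post above; attribute names assume ATA drives):

```shell
# Check the cabling/controller-sensitive SMART attributes on each of
# the three reported drives. A UDMA_CRC_Error_Count that rises on
# several drives at once usually points at shared cabling, backplane,
# or power rather than at the disks themselves.
for dev in /dev/sdd /dev/sdc /dev/sde; do
    echo "== $dev =="
    smartctl -A "$dev" | grep -E 'Reallocated_Sector|Current_Pending_Sector|UDMA_CRC_Error'
done
```

If the pending/reallocated counts are zero and only CRC errors climb, the drives themselves are probably fine.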
JorgeB, May 2, 2023:

Looks like a possible controller issue, but please post the diagnostics.
PandaGod, May 2, 2023:

Thanks for the response. I have since rebooted the machine, so I only have the logs since that boot. It was initially working, but I am getting the following in my drive logs:

May 2 15:01:20 NAS kernel: ata5.00: exception Emask 0x50 SAct 0xc000 SErr 0x4090800 action 0xe frozen
May 2 15:01:20 NAS kernel: ata5.00: irq_stat 0x00400040, connection status changed
May 2 15:01:20 NAS kernel: ata5: SError: { HostInt PHYRdyChg 10B8B DevExch }
May 2 15:01:20 NAS kernel: ata5.00: failed command: READ FPDMA QUEUED
May 2 15:01:20 NAS kernel: ata5.00: cmd 60/80:70:00:24:2f/01:00:79:00:00/40 tag 14 ncq dma 196608 in
May 2 15:01:20 NAS kernel: ata5.00: status: { DRDY }
May 2 15:01:20 NAS kernel: ata5.00: failed command: READ FPDMA QUEUED
May 2 15:01:20 NAS kernel: ata5.00: cmd 60/c8:78:58:0a:72/00:00:74:00:00/40 tag 15 ncq dma 102400 in
May 2 15:01:20 NAS kernel: ata5.00: status: { DRDY }
May 2 15:01:20 NAS kernel: ata5: hard resetting link
May 2 15:01:25 NAS kernel: ata5: link is slow to respond, please be patient (ready=0)
May 2 15:01:27 NAS kernel: ata5: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
May 2 15:01:27 NAS kernel: ata5.00: ACPI cmd f5/00:00:00:00:00:00(SECURITY FREEZE LOCK) filtered out
May 2 15:01:27 NAS kernel: ata5.00: ACPI cmd b1/c1:00:00:00:00:00(DEVICE CONFIGURATION OVERLAY) filtered out
May 2 15:01:27 NAS kernel: ata5.00: ACPI cmd f5/00:00:00:00:00:00(SECURITY FREEZE LOCK) filtered out
May 2 15:01:27 NAS kernel: ata5.00: ACPI cmd b1/c1:00:00:00:00:00(DEVICE CONFIGURATION OVERLAY) filtered out
May 2 15:01:27 NAS kernel: ata5.00: configured for UDMA/133
May 2 15:01:27 NAS kernel: ata5: EH complete

nas-diagnostics-20230502-1511.zip
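Events like these can be tallied quickly to confirm whether one port or several are affected. A rough sketch, assuming the syslog lives at /var/log/syslog (the path inside an Unraid diagnostics zip will differ):

```shell
# Count link-reset and failed-command events per ATA port.
# Several ports misbehaving together suggests a shared cause
# (controller, power, backplane) rather than a single bad drive.
grep -oE 'ata[0-9]+(\.[0-9]+)?: (hard resetting link|failed command|exception)' /var/log/syslog \
    | sort | uniq -c | sort -rn
```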
itimpi, May 2, 2023:

That sort of error in the logs typically means connection issues, such as power or SATA cabling to the drive.
JorgeB, May 2, 2023:

Controller issues with Ryzen boards used to be quite common. They are less frequent with v6.11.5, but I still see them from time to time. You could try updating to v6.12, as the newer kernel might help. There is also a chance it could be a PSU problem. If nothing helps, using an add-on controller would be the best bet.
PandaGod, May 2, 2023:

I am currently running on an ASUS Z370 Strix mini motherboard. I was looking to get an HBA anyway for more drives, so I will try that out. I have also been hearing some horrible noises from the HDDs when those errors show up.
JorgeB, May 2, 2023:

Quoting PandaGod: "I am currently running on an ASUS Z370"

Yeah, sorry, somehow I read that as an X370, so ignore all the Ryzen advice. In that case it would most likely be a PSU problem, since it's affecting all disks at the same time, or the power connectors, if the drives are on a splitter or similar.
PandaGod, May 2, 2023:

Okay, in that case I will try changing the PSU first before I go down the HBA route. Thanks for the help.
PandaGod, May 3, 2023:

I tried isolating the issue by removing my HDD backplane and wiring the HDDs directly. I am now unable to get all the drives to mount correctly; I am getting an "Unmountable disk present" on Disk 2 (sde).

nas-diagnostics-20230503-0914.zip
PandaGod, May 3, 2023:

I ran the disk file system check and these were the results. I don't see anything that clearly indicates success or failure:

Phase 1 - find and verify superblock...
        - block cache size set to 720848 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 871803 tail block 871781
ALERT: The filesystem has valuable metadata changes in a log which is
being ignored because the -n option was used.  Expect spurious
inconsistencies which may be resolved by first mounting the filesystem
to replay the log.
        - scan filesystem freespace and inode maps...
sb_fdblocks 657257137, counted 664072051
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
Maximum metadata LSN (5:873552) is ahead of log (5:871803).
Would format log to cycle 8.
No modify flag set, skipping filesystem flush and exiting.

XFS_REPAIR Summary    Wed May  3 09:26:43 2023

Phase		Start		End		Duration
Phase 1:	05/03 09:24:02	05/03 09:24:02
Phase 2:	05/03 09:24:02	05/03 09:24:07	5 seconds
Phase 3:	05/03 09:24:07	05/03 09:25:31	1 minute, 24 seconds
Phase 4:	05/03 09:25:31	05/03 09:25:32	1 second
Phase 5:	Skipped
Phase 6:	05/03 09:25:32	05/03 09:26:43	1 minute, 11 seconds
Phase 7:	05/03 09:26:43	05/03 09:26:43

Total run time: 2 minutes, 41 seconds

So I am not too sure what I need to do next.
JorgeB, May 3, 2023:

You need to run it again without the -n flag, or nothing will actually be repaired (-n is a read-only dry run).
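For reference, a sketch of what that looks like from the console. The device path is an assumption: on Unraid, Disk 2 is conventionally /dev/md2 (or /dev/md2p1 on 6.12+), and the array must be started in maintenance mode first; running the check from the GUI with -n removed from the options box is equivalent.

```shell
# Hypothetical device path: Disk 2 as /dev/md2 (array in maintenance mode).
# Without -n, xfs_repair actually writes fixes to the filesystem.
xfs_repair -v /dev/md2

# If it refuses because of the dirty log and the disk will not mount so
# the log can be replayed, -L zeroes the log at the cost of losing the
# most recent metadata changes:
# xfs_repair -L /dev/md2
```

Repairing via the md device (rather than the raw sde partition) keeps parity in sync.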
PandaGod, June 7, 2023 (marked as solution):

The issue was a faulty power supply.