Pjhal Posted May 18, 2021

As the title says. The screenshot shows the discs; diagnostics logs are included. Also: when using Krusader I cannot access Disk 3. /mnt/disk3 is somehow a zero-byte file and not a folder.

silverstone-diagnostics-20210518-2122.zip

I put Unraid in maintenance mode and ran a short SMART test on Disk 3; it completed with no errors.

Disk 3 XFS check with -n:

**********************************************************************************************************
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being ignored because the -n option was used. Expect spurious inconsistencies which may be resolved by first mounting the filesystem to replay the log.
        - scan filesystem freespace and inode maps...
sb_fdblocks 456165658, counted 458312794
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 6
        - agno = 4
        - agno = 7
        - agno = 5
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.
**********************************************************************************************************

Edit: I added the SMART diagnostics of all 5 disks with errors after running a short SMART test on each of them. Disk numbers are appended (Disk 1, etc.). Disk 3 is also running the extended SMART test at the moment.
WDC_WD80EMAZ-00W_7HKJT7EJ_35000cca257f1e771-20210518-2240 - Disk 3.txt
WDC_WD80EMAZ-00W_7HKJWUXJ_35000cca257f1f4f1-20210518-2244 - Disk 4.txt
WDC_WD80EZAZ-11T_2SG8U7JJ_35000cca27dc401ba-20210518-2245 Disk 7.txt
WDC_WD80EZAZ-11T_2SG9465F_35000cca27dc4271a-20210518-2244 Disk 6.txt
WDC_WD80EZAZ-11T_7HJJ6AVF_35000cca257e38cc8-20210518-2243 Disk 1.txt

What should I do? How bad is this?
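For anyone following along, the read-only check above can be reproduced from the Unraid console with something like the sketch below. It assumes the array is started in maintenance mode and that Disk 3 is exposed as /dev/md3 (the device name is an assumption; verify it on your own system first). The -n flag makes xfs_repair report-only, so it never writes to the disk.

```shell
# Sketch only: read-only XFS check of Unraid's Disk 3.
# Assumptions: array started in maintenance mode, Disk 3 = /dev/md3.
DEV=/dev/md3
if command -v xfs_repair >/dev/null 2>&1 && [ -e "$DEV" ]; then
    # -n = no modify: report inconsistencies but change nothing on disk
    result=$(xfs_repair -n "$DEV" 2>&1)
else
    result="$DEV or xfs_repair not available - start the array in maintenance mode first"
fi
echo "$result"
```

Note that because a -n run ignores the metadata log (as the ALERT above says), the sb_fdblocks mismatch it reports may simply disappear once the log is replayed by a successful mount.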
JorgeB Posted May 19, 2021

Don't see any controller issues logged, so this is most likely a power/connection problem. Power down the server, check all connections, and power back up; the array should be accessible after that.
Pjhal Posted May 19, 2021

13 hours ago, JorgeB said: Don't see any controller issues logged, so most likely a power/connection problem, power down the server, check all connections and power back up, array should be accessible after that.

Thank you for your response. I have rebooted, and Unraid then reported zero errors. I then started the array in maintenance mode and am now doing a parity check (read only). After that I'll try starting the array normally.
Pjhal Posted May 20, 2021

Okay, it got worse. I finished the parity check with no errors and then tried to start the array normally; now I have 6 unmountable disks. That is every data disk except Disk 5...

Edit: I included new diagnostics.

silverstone-diagnostics-20210520-2253.zip
TechTitus Posted May 20, 2021

24 minutes ago, Pjhal said: Oke it got worse i finished the Parity check with no errors and then tried to start the array normally now i have 6 unmountable Disks. That is every Data Disk except Disk 5... Edit: i included new diagnostics

I'm having the same issues. Are these shucked drives?
Pjhal Posted May 20, 2021 Author Share Posted May 20, 2021 Yes they are, but the Disks them selves are fine according to SMART. This happened after upgrading to 6.9.2 and then downgrading again to 6.8.3. So i am hoping that it is just some limited file inconsistency. And not a mayor failure of hard drives or the whole array. Quote Link to comment
TechTitus Posted May 20, 2021

2 minutes ago, Pjhal said: Yes they are, but the Disks them selves are fine according to SMART. This happened after upgrading to 6.9.2 and then downgrading again to 6.8.3.

Yep, I'm having the exact same issue, and UDMA CRC errors as well. I'm going to swap power supplies to see if it's a power issue.
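Since UDMA CRC errors came up: SMART attribute 199 (UDMA_CRC_Error_Count) counts corrupted transfers on the SATA/SAS link, and it never resets, so only a *rising* value matters. A hedged sketch for reading it follows; the device name /dev/sdh is only an example and must be adjusted per system.

```shell
# Sketch only: read SMART attribute 199 (UDMA_CRC_Error_Count).
# A rising count usually points at the cable/backplane/HBA port, not the platters.
DEV=/dev/sdh   # example device name - adjust to your system
if command -v smartctl >/dev/null 2>&1 && [ -b "$DEV" ]; then
    crc=$(smartctl -A "$DEV" | awk '/UDMA_CRC_Error_Count/ {print $NF}')
else
    crc="unknown (smartctl or $DEV unavailable)"
fi
echo "UDMA CRC error count on $DEV: $crc"
```

Recording the value before and after a cable swap tells you whether the swap actually fixed the link.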
JorgeB Posted May 21, 2021

Read errors on multiple disks:

May 20 22:48:29 Silverstone kernel: md: disk4 read error, sector=8
May 20 22:48:29 Silverstone kernel: md: disk4 read error, sector=16
May 20 22:48:29 Silverstone kernel: md: disk4 read error, sector=24
May 20 22:48:29 Silverstone kernel: md: disk7 read error, sector=8
May 20 22:48:29 Silverstone kernel: md: disk7 read error, sector=16
May 20 22:48:29 Silverstone kernel: md: disk7 read error, sector=24
May 20 22:48:29 Silverstone kernel: md: disk6 read error, sector=8
May 20 22:48:29 Silverstone kernel: md: disk6 read error, sector=16
May 20 22:48:29 Silverstone kernel: md: disk6 read error, sector=24
May 20 22:48:29 Silverstone kernel: Buffer I/O error on dev md1, logical block 0, async page read
### [PREVIOUS LINE REPEATED 1 TIMES] ###
May 20 22:48:29 Silverstone kernel: md: disk1 read error, sector=32
May 20 22:48:29 Silverstone kernel: md: disk1 read error, sector=40
May 20 22:48:29 Silverstone kernel: md: disk1 read error, sector=48

This is likely a power, connection, or controller problem.
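To see at a glance which disks are affected, the md read-error lines can be tallied per disk. The sketch below runs against a few lines taken from the excerpt above; on a live server you would feed it /var/log/syslog instead of the embedded here-doc.

```shell
# Count "md: diskN read error" syslog lines per disk (sample lines embedded).
# Live equivalent (assumption): awk '...' /var/log/syslog
summary=$(awk '/md: disk[0-9]+ read error/ {
    for (i = 1; i <= NF; i++)
        if ($i ~ /^disk[0-9]+$/) n[$i]++   # the "diskN" token identifies the disk
} END {
    for (d in n) print d, n[d]
}' <<'EOF'
May 20 22:48:29 Silverstone kernel: md: disk4 read error, sector=8
May 20 22:48:29 Silverstone kernel: md: disk4 read error, sector=16
May 20 22:48:29 Silverstone kernel: md: disk7 read error, sector=8
May 20 22:48:29 Silverstone kernel: md: disk1 read error, sector=32
EOF
)
echo "$summary"
```

Many disks failing in the same instant, at the same low sector numbers, is the classic signature of a shared cable/power/controller fault rather than simultaneous platter failures.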
Pjhal Posted May 21, 2021

5 hours ago, JorgeB said: Read errors on multiple disks: [...] This is likely a power, connection, or controller problem.

But this issue happened after downgrading from 6.9.2 back to 6.8.3; nothing else changed. I also read that some people had compatibility issues with the newer version. I use this HBA: https://www.broadcom.com/products/storage/host-bus-adapters/sas-9300-8i

What can I do to fix this? I understand that it is hypothetically possible that my power supply failed or that a cable failed, but it seems incredibly unlikely to me that this happens at the exact time that I run into OS issues due to updating and downgrading my OS version.

Edit: Okay, I disconnected and reconnected the HBA and my array is back, so maybe it was a badly seated connection?
JorgeB Posted May 21, 2021

32 minutes ago, Pjhal said: But this issue happened after downgrading from 6.9.2 back to 6.8.3; nothing else changed.

It's still a hardware issue.
JorgeB Posted May 21, 2021

33 minutes ago, Pjhal said: Edit: Okay, I disconnected and reconnected the HBA and my array is back, so maybe it was a badly seated connection?

Missed the edit. Possibly.
Pjhal Posted May 21, 2021

3 minutes ago, JorgeB said: Missed the edit, possibly.

Thank you for your responses, by the way! How should I handle the 22 errors that 6 disks are reporting?
JorgeB Posted May 21, 2021

Rebooting will clear them.
Pjhal Posted May 21, 2021

3 hours ago, JorgeB said: Rebooting will clear them.

As far as I can tell, everything seems to be normal and working again. Thanks again!
Pjhal Posted May 21, 2021

5 hours ago, Pjhal said: As far as I can tell, everything seems to be normal and working again. Thanks again!

Well, the errors are back again, now on Disks 6 and 7.

silverstone-diagnostics-20210522-0008.zip
JorgeB Posted May 22, 2021

Still looks like a power/connection issue.
Pjhal Posted May 22, 2021

9 hours ago, JorgeB said: Still looks like a power/connection issue.

I shut down the server, re-seated the HBA and all disks, then started it up again. After some time, new errors:

May 22 18:02:58 Silverstone kernel: mdcmd (58): spindown 7
May 22 18:09:15 Silverstone kernel: mdcmd (59): spindown 6
May 22 18:15:53 Silverstone kernel: sd 13:0:6:0: [sdh] tag#1409 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
May 22 18:15:53 Silverstone kernel: sd 13:0:6:0: [sdh] tag#1409 Sense Key : 0x5 [current]
May 22 18:15:53 Silverstone kernel: sd 13:0:6:0: [sdh] tag#1409 ASC=0x20 ASCQ=0x0
May 22 18:15:53 Silverstone kernel: sd 13:0:6:0: [sdh] tag#1409 CDB: opcode=0x88 88 00 00 00 00 01 0b b7 0b 50 00 00 00 08 00 00
May 22 18:15:53 Silverstone kernel: print_req_error: critical target error, dev sdh, sector 4491512656
May 22 18:15:53 Silverstone kernel: md: disk6 read error, sector=4491512592
May 22 18:15:53 Silverstone kernel: sd 13:0:5:0: [sdg] tag#1414 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
May 22 18:15:53 Silverstone kernel: sd 13:0:5:0: [sdg] tag#1414 Sense Key : 0x5 [current]
May 22 18:15:53 Silverstone kernel: sd 13:0:5:0: [sdg] tag#1414 ASC=0x20 ASCQ=0x0
May 22 18:15:53 Silverstone kernel: sd 13:0:5:0: [sdg] tag#1414 CDB: opcode=0x88 88 00 00 00 00 01 0b b7 0b 50 00 00 00 08 00 00
May 22 18:15:53 Silverstone kernel: print_req_error: critical target error, dev sdg, sector 4491512656
May 22 18:15:53 Silverstone kernel: md: disk7 read error, sector=4491512592

The weird thing that stands out to me is that the errors occur right after those two disks happen to spin down. Could that be related?

Also, if it is a hardware defect... I don't have a spare HBA, a properly sized power supply, or a spare SAS cable to do any testing (by swapping them out), so I am at a loss as to how I should handle this right now. Is there anything I can do?

silverstone-diagnostics-20210522-1828.zip
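The spin-down suspicion can be checked mechanically: list the disks that were spun down, then see which of them logged a read error afterwards. A sketch against the excerpt above; on a live server you would replace the embedded log with the output of `grep -E 'spindown|read error' /var/log/syslog`.

```shell
# Which disks logged a read error after being spun down? (sample log embedded)
log='May 22 18:02:58 Silverstone kernel: mdcmd (58): spindown 7
May 22 18:09:15 Silverstone kernel: mdcmd (59): spindown 6
May 22 18:15:53 Silverstone kernel: md: disk6 read error, sector=4491512592
May 22 18:15:53 Silverstone kernel: md: disk7 read error, sector=4491512592'
suspects=$(printf '%s\n' "$log" | awk '
    /spindown/   { down[$NF] = 1 }              # remember spun-down disk numbers
    /read error/ { d = $7; sub(/^disk/, "", d)  # $7 is the "diskN" token here
                   if (down[d]) print "disk" d }
' | sort -u)
echo "errors after spindown on: $suspects"
```

If every error follows a spin-down of the same disk, that strengthens the case for a wake-up/power problem rather than random link noise.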
JorgeB Posted May 23, 2021

17 hours ago, Pjhal said: Could that be related?

It could, though I don't remember spin-down issues with WDs. Try disabling spin down to see if it changes anything.
Pjhal Posted May 24, 2021

On 5/23/2021 at 11:39 AM, JorgeB said: It could, though I don't remember spin-down issues with WDs. Try disabling spin down to see if it changes anything.

After disabling spin down on all disks and restarting the server, it has now been running for 1 day and 3 hours without any errors, so I am assuming it is fixed.