dzyuba86 Posted November 30, 2022
I recently added two new 10TB drives, then moved two drives to a cooler location in my tower since I had the time to take the array offline and fiddle with it. Now I have two drives saying "Unmountable: wrong or no file system". One drive says "device disabled, contents emulated" and the other says "normal operation" (red X and green dot, respectively). I am at a loss as to why this happened. I already shut down, tried new SATA cables, and tried a different SATA port. I still get the same error message. I have attached the diagnostic log. Not sure how to fix this.
plexnas-diagnostics-20221130-0919.zip
trurl Posted November 30, 2022
SMART report for both disks looks OK, but no SMART tests have been run on either. Possibly you already fixed the hardware problem when you worked on the connections. Check filesystem on each. Be sure to capture the output so you can post it.
dzyuba86 Posted November 30, 2022
9 minutes ago, trurl said: "SMART report for both disks looks OK but no SMART tests have been run on either. Possibly you already fixed the hardware problem when you worked on the connections. Check filesystem on each. Be sure to capture the output so you can post it."
I'm not sure whether the diagnostics will include it, but in short, the check reported a bunch of errors and didn't give me any options for what steps to take for repair.
plexnas-diagnostics-20221130-0951.zip
dzyuba86 Posted November 30, 2022
Mostly looks like this:
Phase 1 - find and verify superblock...
        - block cache size set to 686216 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 119596 tail block 119596
        - scan filesystem freespace and inode maps...
Metadata CRC error detected at 0x43d440, xfs_agf block 0x1/0x200
agf has bad CRC for ag 0
Metadata CRC error detected at 0x468740, xfs_agi block 0x2/0x200
agi has bad CRC for ag 0
bad magic # 0x0 for agf 0
bad version # 0 for agf 0
bad length 0 for agf 0, should be 268435455
bad uuid 7e8511b0-2013-0ac6-166a-e877aae91f03 for agf 0
bad magic # 0x0 for agi 0
bad version # 0 for agi 0
bad length # 0 for agi 0, should be 268435455
bad uuid 7e8511b0-2013-0ac6-166a-e877aae91f03 for agi 0
would reset bad agf for ag 0
would reset bad agi for ag 0
bad uncorrected agheader 0, skipping ag...
sb_icount 11840, counted 11008
sb_fdblocks 50924179, counted 49052402
root inode chunk not found
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
Metadata corruption detected at 0x4379a3, xfs_inode block 0x80/0x4000
Metadata corruption detected at 0x4379a3, xfs_inode block 0xa0/0x4000
bad CRC for inode 128
bad magic number 0x0 on inode 128
bad version number 0x0 on inode 128
bad next_unlinked 0x0 on inode 128
inode identifier 0 mismatch on inode 128
bad CRC for inode 129...........
bad CRC for inode 191
bad magic number 0x0 on inode 191
bad version number 0x0 on inode 191
bad next_unlinked 0x0 on inode 191
inode identifier 0 mismatch on inode 191
bad CRC for inode 128, would rewrite
bad magic number 0x0 on inode 128, would reset magic number
bad version number 0x0 on inode 128, would reset version number
bad next_unlinked 0x0 on inode 128, would reset next_unlinked
inode identifier 0 mismatch on inode 128
would clear root inode 128.........
would have cleared inode 158
bad CRC for inode 159, would rewrite
bad magic number 0x0 on inode 159, would reset magic number
bad version number 0x0 on inode 159, would reset version number
bad next_unlinked 0x0 on inode 159, would reset next_unlinked
inode identifier 0 mismatch on inode 159
would have cleared inode 159
imap claims inode 160 is present, but inode cluster is sparse, correcting imap
imap claims inode 161 is present, but inode cluster is sparse, correcting imap
imap claims inode 162 is present, but inode cluster is sparse, correcting imap
imap claims inode 163 is present, but inode cluster is sparse, correcting imap......
No modify flag set, skipping filesystem flush and exiting.
XFS_REPAIR Summary    Wed Nov 30 06:49:18 2022
Phase       Start            End              Duration
Phase 1:    11/30 06:46:57   11/30 06:47:14   17 seconds
Phase 2:    11/30 06:47:14   11/30 06:47:15   1 second
Phase 3:    11/30 06:47:15   11/30 06:49:17   2 minutes, 2 seconds
Phase 4:    11/30 06:49:17   11/30 06:49:17
Phase 5:    Skipped
Phase 6:    11/30 06:49:17   11/30 06:49:18   1 second
Phase 7:    11/30 06:49:18   11/30 06:49:18
Total run time: 2 minutes, 21 seconds
I ran the check with the -nv flags.
dzyuba86 Posted November 30, 2022
The other drive seems worse.
Phase 1 - find and verify superblock...
        - block cache size set to 701112 entries
Phase 2 - using internal log
        - zero log...
totally zeroed log
zero_log: head block 0 tail block 0
        - scan filesystem freespace and inode maps...
Metadata CRC error detected at 0x44108d, xfs_bnobt block 0x8/0x1000
btree block 0/1 is suspect, error -74
bad magic # 0 in btbno block 0/1
Metadata CRC error detected at 0x44108d, xfs_cntbt block 0x10/0x1000
btree block 0/2 is suspect, error -74
bad magic # 0 in btcnt block 0/2
Metadata CRC error detected at 0x4728bd, xfs_refcountbt block 0x28/0x1000
btree block 0/5 is suspect, error -74
bad magic # 0 in refcount btree block 0/5
bad refcountbt block count 0, saw 1
agf_freeblks 268435437, counted 0 in ag 0
agf_longest 268435431, counted 0 in ag 0
Metadata CRC error detected at 0x46fd5d, xfs_inobt block 0x18/0x1000
btree block 0/3 is suspect, error -74
bad magic # 0 in inobt block 0/3
Metadata CRC error detected at 0x46fd5d, xfs_finobt block 0x20/0x1000
btree block 0/4 is suspect, error -74
bad magic # 0 in finobt block 0/4
agi_count 64, counted 0 in ag 0
agi_freecount 61, counted 0 in ag 0
agi_freecount 61, counted 0 in ag 0 finobt
sb_icount 64, counted 0
sb_ifree 61, counted 0
sb_fdblocks 1952984849, counted 1684549412
root inode chunk not found
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
Metadata corruption detected at 0x4379a3, xfs_inode block 0x80/0x4000
Metadata corruption detected at 0x4379a3, xfs_inode block 0xa0/0x4000
bad CRC for inode 128
bad magic number 0x0 on inode 128
bad version number 0x0 on inode 128
bad next_unlinked 0x0 on inode 128...
bad CRC for inode 158, would rewrite
bad magic number 0x0 on inode 158, would reset magic number
bad version number 0x0 on inode 158, would reset version number
bad next_unlinked 0x0 on inode 158, would reset next_unlinked
inode identifier 0 mismatch on inode 158
would have cleared inode 158
bad CRC for inode 159, would rewrite
bad magic number 0x0 on inode 159, would reset magic number
bad version number 0x0 on inode 159, would reset version number
bad next_unlinked 0x0 on inode 159, would reset next_unlinked
inode identifier 0 mismatch on inode 159
would have cleared inode 159
would rebuild corrupt refcount btrees.
No modify flag set, skipping phase 5
Inode allocation btrees are too corrupted, skipping phases 6 and 7
Maximum metadata LSN (1:161150) is ahead of log (0:0).
Would format log to cycle 4.
No modify flag set, skipping filesystem flush and exiting.
XFS_REPAIR Summary    Wed Nov 30 06:50:22 2022
Phase       Start            End              Duration
Phase 1:    11/30 06:50:22   11/30 06:50:22
Phase 2:    11/30 06:50:22   11/30 06:50:22
Phase 3:    11/30 06:50:22   11/30 06:50:22
Phase 4:    11/30 06:50:22   11/30 06:50:22
Phase 5:    Skipped
Phase 6:    Skipped
Phase 7:    Skipped
trurl Posted November 30, 2022
1 hour ago, trurl said: "Possibly you already fixed the hardware problem when you worked on the connections."
Apparently not.
...
Nov 30 06:48:28 PLEXNAS kernel: ata18: hard resetting link
Nov 30 06:48:28 PLEXNAS kernel: ata19: found unknown device (class 0)
Nov 30 06:48:32 PLEXNAS kernel: ata19: softreset failed (1st FIS failed)
...
and lots more. Check connections, SATA and power, both ends, including splitters. Then reboot and post new diagnostics.
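To see which ports the link resets cluster on, the syslog from a diagnostics zip can be tallied per ATA port. The following is only a rough sketch: the sample text is the three lines quoted above, while the phrase list `LINK_TROUBLE` and the function name `count_link_trouble` are illustrative assumptions, not part of any Unraid tool.

```python
import re
from collections import Counter

# Sample lines copied from the syslog excerpt quoted in the post above.
SAMPLE_SYSLOG = """\
Nov 30 06:48:28 PLEXNAS kernel: ata18: hard resetting link
Nov 30 06:48:28 PLEXNAS kernel: ata19: found unknown device (class 0)
Nov 30 06:48:32 PLEXNAS kernel: ata19: softreset failed (1st FIS failed)
"""

# Assumption: these phrases are treated as signs of link trouble.
# The list is illustrative and can be extended.
LINK_TROUBLE = ("hard resetting link", "found unknown device", "softreset failed")

# Matches "kernel: ata19: message" as well as "kernel: ata19.00: message".
ATA_LINE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: (.*)")


def count_link_trouble(syslog_text):
    """Count link-related kernel messages per ATA port."""
    counts = Counter()
    for line in syslog_text.splitlines():
        match = ATA_LINE.search(line)
        if match and any(phrase in match.group(2) for phrase in LINK_TROUBLE):
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for port, hits in sorted(count_link_trouble(SAMPLE_SYSLOG).items()):
        print(f"{port}: {hits} link-related message(s)")
```

Ports that keep showing up after cables have been swapped point toward something shared, such as power or the controller, rather than a single cable.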
dzyuba86 Posted November 30, 2022
New diagnostics. I have now replaced both the power and data cables with brand-new ones. Still the same issue.
plexnas-diagnostics-20221130-1252.zip
dzyuba86 Posted November 30, 2022
Now disk 7 is reading as an unassigned device.
plexnas-diagnostics-20221130-1330.zip
dzyuba86 Posted November 30, 2022
Tried doing a rebuild and it failed 1 minute in. Guessing my disk is shot?
dzyuba86 Posted November 30, 2022
Would it make sense to pull the drive out, format it through my SATA enclosure with my laptop, and put it back in to rebuild, or no? The two drives having problems are the second-newest drives in my array.
trurl Posted November 30, 2022
3 hours ago, dzyuba86 said: "Tried doing a rebuild and it failed 1 minute in."
Did you get diagnostics then?
trurl Posted November 30, 2022
3 hours ago, dzyuba86 said: "Would it make sense to pull the drive out, format through my sata enclosure with my laptop and put back in to rebuild or no?"
No, that would not make sense. It doesn't matter how it behaves in another system, and it doesn't matter whether a disk is formatted, cleared, or completely full, since rebuild will completely overwrite the disk. We need to see diagnostics taken when the problems occur. Those earlier diagnostics I quoted from seemed to indicate some sort of connection issue.
trurl Posted November 30, 2022
2 minutes ago, trurl said: "Doesn't matter how it behaves in another system"
I guess if it works in another system, that would at least tell you something about the disk, but not about the problem you are having with it on Unraid. No point in formatting it, though. A format done on another system can't be used on Unraid, and, as mentioned, rebuild doesn't care whether or how a drive is formatted, since it will completely overwrite whatever format is there. A better idea would be to run diagnostics from the drive manufacturer, if your enclosure will support that.
dzyuba86 Posted December 1, 2022
I downloaded and ran the Seagate diagnostic tools. The 8TB drive passes the SMART test and my laptop can see the drive with no problem. The 10TB drive comes up as uninitialized right away, fails to pass any SMART test, and gives a failed CRC error when I try to initialize the disk. I will check with warranty support tomorrow about a replacement. I have one more year of warranty on it. I still need to figure out what to do about the 8TB and why it isn't working.
trurl Posted December 1, 2022
1 hour ago, dzyuba86 said: "Still need to figure out what to do about the 8TB and why it isn't working."
Try to use it again in Unraid and post diagnostics.
3 hours ago, trurl said: "We need to see diagnostics taken when the problems occur."
dzyuba86 Posted December 1, 2022
10 hours ago, trurl said: "Try to use it again in Unraid and post diagnostics"
It's currently running a rebuild on the 8TB. The 10TB is shot and I'm sending it out for an RMA. I'll pick up a new one at the local tech shop to rebuild with, since it'll take 1-2 weeks to get the warranty replacement. One of my 3TB drives is throwing pre-fail errors, so I'll swap it with a 10TB. If the rebuild fails, I'll pull a diagnostic log and post it. All I did was deselect the disk from the slot and then reassign it, and it started the rebuild. I'm not sure whether the SMART test reset something on the drive, or whether my deselect, spin up, and select to rebuild did the trick.
trurl Posted December 1, 2022
1 hour ago, dzyuba86 said: "All I did was deselect the disk from the slot then reassign it and it started the rebuild."
If the array is started without a disk and the disk is then reassigned, Unraid treats it as a replacement and rebuilds it.
1 hour ago, dzyuba86 said: "One of my 3TB drives is throwing pre-fail errors"
Which disk was that? It could compromise the rebuild, and might be a reason the emulated disk is unmountable. Parity by itself can recover nothing. Parity just allows the contents of a missing disk to be calculated from the contents of all other disks. https://wiki.unraid.net/Manual/Overview#Parity-Protected_Array
1 hour ago, dzyuba86 said: "1-2 weeks to get the warranty replacement"
Since you have multiple disks with problems, I suggest you not wait on that replacement and get another disk immediately to get your array stable again.
1 hour ago, dzyuba86 said: "If rebuild fails I'll pull a diagnostic log"
It might be worth seeing diagnostics even if it succeeds, since you have multiple issues. Do you have Notifications set up to alert you immediately by email or other agent as soon as a problem is detected? Do you have backups of anything important and irreplaceable?
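The point about parity can be illustrated with a toy model. This is a minimal sketch assuming single XOR parity over equal-sized disks; the 4-byte "disks" and the helper name `xor_blocks` are made up for illustration and are not Unraid code.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Toy 4-byte "disks"; real disks are simply much longer byte sequences.
disk1 = bytes([0x10, 0x22, 0x35, 0x47])
disk2 = bytes([0xA0, 0x0B, 0xC4, 0x0D])
disk3 = bytes([0xFF, 0x00, 0x5A, 0x01])

# Parity is computed from the contents of every data disk.
parity = xor_blocks([disk1, disk2, disk3])

# Disk 2 goes missing: its contents are calculated from parity plus
# ALL of the remaining disks, which is what emulation and rebuild rely on.
rebuilt_disk2 = xor_blocks([parity, disk1, disk3])
print(rebuilt_disk2 == disk2)

# If another disk returns wrong data during the rebuild (one byte differs
# here), the calculated contents of disk 2 come out wrong as well.
flaky_disk3 = bytes([0xFF, 0x00, 0x5B, 0x01])
bad_rebuild = xor_blocks([parity, disk1, flaky_disk3])
print(bad_rebuild == disk2)
```

In this model, every other disk has to read correctly for the whole rebuild, which is why a second disk with errors or a dropped connection can leave the emulated disk unmountable or make the rebuild fail.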
dzyuba86 Posted December 2, 2022
Here's the log. The rebuild failed halfway.
plexnas-diagnostics-20221202-0748.zip
trurl Posted December 2, 2022
On 12/1/2022 at 7:23 AM, dzyuba86 said: "One of my 3TB drives is throwing pre-fail errors"
22 hours ago, trurl said: "Which disk was that? Could compromise rebuild, and might be a reason emulation is unmountable."
trurl Posted December 2, 2022
Connection problems on multiple disks. Maybe you have a power problem? Do you have any power splitters?
On 11/30/2022 at 11:17 AM, trurl said: "Check connections, SATA and power, both ends, including splitters."
dzyuba86 Posted December 2, 2022
Yes. I'm using three 1-to-4 power splitters.
trurl Posted December 2, 2022
On 12/1/2022 at 7:23 AM, dzyuba86 said: "One of my 3TB drives is throwing pre-fail errors"
3rd try. Which disk?
dzyuba86 Posted December 2, 2022
7 minutes ago, trurl said: "3rd try. Which disk?"
I got an error popup, and when I clicked on it, it went away and I forgot which disk it was. So I can't answer that for sure.
trurl Posted December 2, 2022
27 minutes ago, dzyuba86 said: "Yes. I'm using three 1 to 4 power splitter."
Molex or SATA splitters? Molex is better; it handles more current. Rebuilds and parity checks require all disks simultaneously.
11 minutes ago, dzyuba86 said: "forgot which disk."
Do any disks show SMART warnings on the Dashboard page? Which?
dzyuba86 Posted December 2, 2022
25 minutes ago, trurl said: "Molex or SATA splitters? Molex is better, handles more current. Rebuilds, parity checks require all disks simultaneously. Do any disks show SMART warnings on the Dashboard page? Which?"
SATA splitters. I'm thinking of going to the computer shop to see if they have better modular plugs for my power supply. No, there are no SMART errors on the dashboard, which is super odd. Unless the error corrected itself?