14935 Posted May 23, 2021
Hello, I recently discovered my server (v. 6.8.3) with one of two parity drives disabled, and a data drive that shows up as having no file system. I ran extended SMART tests on both drives and nothing in the reports jumps out at me. I got a copy of the syslog before shutting down but neglected to collect diagnostics.
I had a previous incident of 3 disabled drives after a power failure (server was connected to a small UPS) that seemed to resolve itself after a restart, but this second incident has me thinking it is not just a fluke. I had been using an older power supply and 2 I/O Crest 4 port SATA controllers (which use Marvell 9215 chipset). The power supply has been replaced and I am now using a Dell H310 HBA and onboard SATA (thank you ArtOfServer).
This bit of my syslog suggests doing an XFS repair:
May 13 08:13:14 Unraid2 emhttpd: shcmd (49): mkdir -p /mnt/disk6
May 13 08:13:14 Unraid2 emhttpd: shcmd (50): mount -t xfs -o noatime,nodiratime /dev/md6 /mnt/disk6
May 13 08:13:14 Unraid2 kernel: XFS (md6): Mounting V5 Filesystem
May 13 08:13:14 Unraid2 kernel: XFS (md6): Corruption warning: Metadata has LSN (4:3579107) ahead of current LSN (4:1815408). Please unmount and run xfs_repair (>= v4.3) to resolve.
May 13 08:13:14 Unraid2 kernel: XFS (md6): log mount/recovery failed: error -22
May 13 08:13:14 Unraid2 kernel: XFS (md6): log mount failed
May 13 08:13:14 Unraid2 root: mount: /mnt/disk6: wrong fs type, bad option, bad superblock on /dev/md6, missing codepage or helper program, or other error.
May 13 08:13:14 Unraid2 emhttpd: shcmd (50): exit status: 32
May 13 08:13:14 Unraid2 emhttpd: /mnt/disk6 mount error: No file system
May 13 08:13:14 Unraid2 emhttpd: shcmd (51): umount /mnt/disk6
May 13 08:13:14 Unraid2 root: umount: /mnt/disk6: not mounted.
May 13 08:13:14 Unraid2 emhttpd: shcmd (51): exit status: 32
May 13 08:13:14 Unraid2 emhttpd: shcmd (52): rmdir /mnt/disk6
My question is, should I try to resolve the file system issue first, or the disabled parity drive? With 2 drives disabled, I am nervous about doing the wrong thing...
Log files attached: syslog pre-shutdown, and diagnostics after rebooting but before swapping PS and controller card. Thanks very much for any advice!
unraid2-diagnostics-20210514-1156.zip unraid2-smart-20210513-0819.zip unraid2-smart-20210514-1153.zip unraid2-syslog-20210509-2237.zip
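For readers following the log in the first post: the kernel prints each XFS log sequence number (LSN) as cycle:block, and the mount fails because the metadata carries an LSN later than the log's current position. A minimal Python sketch, assuming the message format shown in that syslog excerpt (the name `parse_lsn_warning` is made up for illustration), pulls out the two values so they can be compared:

```python
import re

# Pattern for the XFS warning text as it appears in the syslog excerpt.
LSN_WARNING = re.compile(
    r"Metadata has LSN \((\d+):(\d+)\) ahead of current LSN \((\d+):(\d+)\)"
)


def parse_lsn_warning(line):
    """Return ((cycle, block) of metadata, (cycle, block) of log), or None."""
    match = LSN_WARNING.search(line)
    if match is None:
        return None
    meta_cycle, meta_block, log_cycle, log_block = map(int, match.groups())
    return (meta_cycle, meta_block), (log_cycle, log_block)


sample = (
    "May 13 08:13:14 Unraid2 kernel: XFS (md6): Corruption warning: "
    "Metadata has LSN (4:3579107) ahead of current LSN (4:1815408). "
    "Please unmount and run xfs_repair (>= v4.3) to resolve."
)
metadata_lsn, log_lsn = parse_lsn_warning(sample)
# Tuples compare cycle first, then block, which matches LSN ordering.
print(metadata_lsn > log_lsn)
```

Here both LSNs are in cycle 4, but the metadata block number is far ahead of the log's, which is the condition the kernel message reports.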
trurl Posted May 24, 2021
Start the array and post new diagnostics. Can't tell anything about unmountable disks without the array started.
14935 Posted May 24, 2021
Hi trurl, yes, that makes sense... Here are the new diagnostics. Thanks for your quick reply.
unraid2-diagnostics-20210523-2148.zip
JorgeB Posted May 24, 2021
First try to repair the filesystem on the emulated disk6. If it's successfully repaired and all the data looks to be there, you can rebuild on top; if not, you can use a spare or do a new config instead.
14935 Posted May 24, 2021
I tried doing a repair dry run with the -n switch and got this:
JorgeB Posted May 24, 2021
Reboot and try again, or do it on the console.
14935 Posted May 24, 2021
I got the same result after rebooting. This is the final part of the output when running from the console:
would clear inode number in entry at offset 408...
would rebuild corrupt refcount btrees.
No modify flag set, skipping phase 5
Inode allocation btrees are too corrupted, skipping phases 6 and 7
Maximum metadata LSN (6:2812895) is ahead of log (3:1819503).
Would format log to cycle 9.
No modify flag set, skipping filesystem flush and exiting.
Is the "skipping phases 6 and 7" a bad sign? I could remove the disk and replace it if needed.
JorgeB Posted May 24, 2021
24 minutes ago, 14935 said: Is the "skipping phases 6 and 7" a bad sign?
No, it just means you're running it with the no modify flag (-n). Run it again without it.
14935 Posted May 24, 2021
O.K. Is it alright to use the -L switch, as it is suggesting?
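The order of operations discussed in the posts above (a dry run with -n, then a real run, with -L only when xfs_repair asks for it) can be laid out as a short Python sketch. It only prints the command lines and runs nothing; the device /dev/md6 is taken from the syslog in the first post, and the helper name `xfs_repair_steps` is hypothetical. Per the xfs_repair man page, -L forces the log to be zeroed even if it is dirty, which can lose the most recent metadata changes, so it is listed last.

```python
def xfs_repair_steps(device):
    """Return the xfs_repair command lines in the order they would be tried."""
    return [
        # 1. Dry run: report problems without modifying anything.
        ["xfs_repair", "-n", device],
        # 2. Real repair: the same command without -n.
        ["xfs_repair", device],
        # 3. Only if step 2 refuses because of a dirty log that cannot be
        #    replayed by mounting: force log zeroing.
        ["xfs_repair", "-L", device],
    ]


# Print the commands for review; nothing is executed here.
for step in xfs_repair_steps("/dev/md6"):
    print(" ".join(step))
```

Keeping the commands as argument lists rather than one string makes it harder to run the destructive variant by accident when adapting the sketch.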
14935 Posted May 24, 2021
The repair finished and Disk6 shows a filesystem again, but now I have a lost and found folder with almost 300,000 files and folders in it... Would it make sense to try rebuilding disk6 from my remaining good parity disk? If so, would I do that in conjunction with rebuilding my second (offline) parity disk?
JorgeB Posted May 24, 2021
5 minutes ago, 14935 said: Would it make sense to try rebuilding disk6 from my remaining good parity disk?
Rebuilding from parity will result in exactly the same content you are seeing on the emulated disk. If the emulated filesystem is very damaged, it might be better to re-sync parity with the old disk, or to use the old disk to copy the data back to the array if you prefer to rebuild to a spare.
The first thing to do is confirm the old disk still mounts and its contents are OK. To do that, first unassign that disk from the array, then start the array (the emulated disk will remain as is for now), then change the XFS UUID of the old unassigned disk with:
xfs_admin -U generate /dev/sdX1
Replace X with the correct letter. After that, use the UD plugin to mount the old disk and check the data. If all looks fine, do either option mentioned above. Feel free to ask if there are any doubts.
14935 Posted May 24, 2021
Hi JorgeB, thanks for your help. I think I am getting confused. The files and folders under lost & found are the emulated disk? If I was to rebuild back on to Disk6, would all that data end up back in its proper place instead of just dumped into one giant directory with meaningless file names and no file extensions?
I am also not clear on this part: "then change the xfs UUID to the old unassigned disk with: xfs_admin -U generate /dev/sdX1 Replace X with correct letter". I have been referring to Spaceinvader One's "How to fix XFS file system corruption" video. Do I need to watch that again?
JorgeB Posted May 24, 2021
8 minutes ago, 14935 said: The files and folders under lost & found are the emulated disk?
Yes, assuming you were fixing the filesystem on disk6 (md6).
14 minutes ago, 14935 said: If I was to rebuild back on to Disk6, would all that data end up back in its proper place instead of just dumped into one giant directory with meaningless file names and no file extensions?
No. As mentioned, what you see on the emulated disk is what you'll see after a rebuild.
14935 Posted May 24, 2021
Now I am really confused. How does having 2 parity drives help me if my data gets scrambled on one disk and the parity information can't help "unscramble" it?
JorgeB Posted May 24, 2021
Parity is for replacing one or more failed disks. It can't help with filesystem corruption, which is why parity (and Unraid or any other RAID array) isn't a backup; it just adds redundancy.
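The point about parity and filesystem corruption can be illustrated with a toy model. This is a sketch under simplifying assumptions: it models a single XOR parity only (the second parity disk uses a different calculation that is not modelled here), and the disk contents and the helper `xor_blocks` are made up. Because parity is kept in sync with every write to the array, a corrupting write is recorded just as faithfully as a good one, so a rebuild reproduces the corrupted bytes.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized byte strings together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three toy "data disks" of equal size (contents are made up).
disk1 = b"AAAA"
disk2 = b"BBBB"
disk3_good = b"FILE"

# Filesystem corruption is just another write as far as parity is concerned,
# so parity ends up describing the corrupted contents.
disk3_corrupt = b"F\x00\x00E"
parity = xor_blocks([disk1, disk2, disk3_corrupt])

# Disk 3 is disabled; rebuild it from parity plus the surviving disks.
rebuilt = xor_blocks([parity, disk1, disk2])
print(rebuilt == disk3_corrupt)  # the rebuild matches the corrupted state
print(rebuilt == disk3_good)     # and not the original good data
```

The same arithmetic is why the emulated disk and a rebuilt disk show identical contents: both are computed from parity and the other disks.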
14935 Posted May 24, 2021
Thanks again for your help. I'm feeling like I might be on the verge of taking this from bad to worse, so I am going to put it aside for a day or two before continuing.
trurl Posted May 25, 2021
On 5/24/2021 at 12:34 PM, 14935 said: put it aside for a day or two before continuing.
You have no redundancy currently. The single enabled parity disk allows you to emulate the disabled data disk, but if another data disk has a problem you wouldn't be able to recover either.
trurl Posted May 27, 2021
@14935
On 5/25/2021 at 4:25 PM, trurl said: You have no redundancy currently. The single enabled parity disk allows you to emulate the disabled data disk, but if another data disk has a problem you wouldn't be able to recover either.