mattw Posted December 19, 2022 (edited)

It all started when I wanted to remove my cache drive; the system had been unstable after a RAM upgrade and the addition of my first cache drive. So, I shut down and removed the RAM. Booted up and everything was good. Did Tools > New Config, cleared the config, and marked the cache drive as not installed. Rebooted to an offline array that wanted me to reassign devices.

This is where the problem begins... my parity drive and one of my 2TB data drives are one digit different in the serial number, and I reversed them and rebooted. The system restarted and started a parity check, which I canceled because I thought it was odd. Looked at the disk assignments and realized I had really screwed up. I reassigned the drives and rebooted with "parity valid" checked to try to avert more damage. Realizing what I had done, I unassigned the parity disk, moved it to the array, and started the array. It now tells me that the 2 WD drives are unreadable; I somewhat expected that, since it likely overwrote part of one or the other.

The cache drive that started all of this was on a PCI controller. I removed the controller and moved the empty cache drive to one of my mobo SATA controller ports.

So, how screwed am I? I do not have a backup, since getting to an offsite backup at 30Mbps is painful. The loss of years' worth of movies and all of my digitized CD collection will be painful. I am in this state now... parity drive in array and array not starting, but it knows about my shares. The system now thinks my array is 70% full; it was 49% before this adventure.

Interesting stuff from the logs... disk 3 sees corruption; it was the disk swapped in place of the parity drive. Disk 5 is showing unsupported file system (real parity drive).
Dec 18 20:02:36 Tower emhttpd: shcmd (154): mkdir -p /mnt/disk1
Dec 18 20:02:36 Tower emhttpd: shcmd (155): mount -t xfs -o noatime,nouuid /dev/md1 /mnt/disk1
Dec 18 20:02:36 Tower kernel: SGI XFS with ACLs, security attributes, no debug enabled
Dec 18 20:02:36 Tower kernel: XFS (md1): Mounting V5 Filesystem
Dec 18 20:02:36 Tower kernel: XFS (md1): Ending clean mount
Dec 18 20:02:36 Tower emhttpd: shcmd (156): xfs_growfs /mnt/disk1
Dec 18 20:02:36 Tower root: meta-data=/dev/md1 isize=512 agcount=4, agsize=91571160 blks
Dec 18 20:02:36 Tower root: = sectsz=512 attr=2, projid32bit=1
Dec 18 20:02:36 Tower root: = crc=1 finobt=1, sparse=1, rmapbt=0
Dec 18 20:02:36 Tower root: = reflink=1 bigtime=1 inobtcount=1
Dec 18 20:02:36 Tower root: data = bsize=4096 blocks=366284638, imaxpct=5
Dec 18 20:02:36 Tower root: = sunit=0 swidth=0 blks
Dec 18 20:02:36 Tower root: naming =version 2 bsize=4096 ascii-ci=0, ftype=1
Dec 18 20:02:36 Tower root: log =internal log bsize=4096 blocks=178849, version=2
Dec 18 20:02:36 Tower root: = sectsz=512 sunit=0 blks, lazy-count=1
Dec 18 20:02:36 Tower root: realtime =none extsz=4096 blocks=0, rtextents=0
Dec 18 20:02:36 Tower emhttpd: shcmd (157): mkdir -p /mnt/disk2
Dec 18 20:02:36 Tower emhttpd: shcmd (158): mount -t xfs -o noatime,nouuid /dev/md2 /mnt/disk2
Dec 18 20:02:36 Tower kernel: XFS (md2): Mounting V5 Filesystem
Dec 18 20:02:37 Tower kernel: XFS (md2): Ending clean mount
Dec 18 20:02:37 Tower emhttpd: shcmd (159): xfs_growfs /mnt/disk2
Dec 18 20:02:37 Tower root: meta-data=/dev/md2 isize=512 agcount=4, agsize=91571160 blks
Dec 18 20:02:37 Tower root: = sectsz=512 attr=2, projid32bit=1
Dec 18 20:02:37 Tower root: = crc=1 finobt=1, sparse=1, rmapbt=0
Dec 18 20:02:37 Tower root: = reflink=1 bigtime=1 inobtcount=1
Dec 18 20:02:37 Tower root: data = bsize=4096 blocks=366284638, imaxpct=5
Dec 18 20:02:37 Tower root: = sunit=0 swidth=0 blks
Dec 18 20:02:37 Tower root: naming =version 2 bsize=4096 ascii-ci=0, ftype=1
Dec 18 20:02:37 Tower root: log =internal log bsize=4096 blocks=178849, version=2
Dec 18 20:02:37 Tower root: = sectsz=512 sunit=0 blks, lazy-count=1
Dec 18 20:02:37 Tower root: realtime =none extsz=4096 blocks=0, rtextents=0
Dec 18 20:02:37 Tower emhttpd: shcmd (160): mkdir -p /mnt/disk3
Dec 18 20:02:37 Tower emhttpd: shcmd (161): blkid -t TYPE='xfs' /dev/md3 &> /dev/null
Dec 18 20:02:37 Tower emhttpd: shcmd (162): mount -t xfs -o noatime,nouuid /dev/md3 /mnt/disk3
Dec 18 20:02:37 Tower kernel: XFS (md3): Mounting V5 Filesystem
Dec 18 20:02:37 Tower kernel: XFS (md3): Corruption warning: Metadata has LSN (1:115531) ahead of current LSN (1:71954). Please unmount and run xfs_repair (>= v4.3) to resolve.
Dec 18 20:02:37 Tower kernel: XFS (md3): log mount/recovery failed: error -22
Dec 18 20:02:37 Tower kernel: XFS (md3): log mount failed
Dec 18 20:02:37 Tower root: mount: /mnt/disk3: wrong fs type, bad option, bad superblock on /dev/md3, missing codepage or helper program, or other error.
Dec 18 20:02:37 Tower root: dmesg(1) may have more information after failed mount system call.
Dec 18 20:02:37 Tower emhttpd: shcmd (162): exit status: 32
Dec 18 20:02:37 Tower emhttpd: /mnt/disk3 mount error: Wrong or no file system
Dec 18 20:02:37 Tower emhttpd: shcmd (163): umount /mnt/disk3
Dec 18 20:02:37 Tower root: umount: /mnt/disk3: not mounted.
Dec 18 20:02:37 Tower emhttpd: shcmd (163): exit status: 32
Dec 18 20:02:37 Tower emhttpd: shcmd (164): rmdir /mnt/disk3
Dec 18 20:02:37 Tower emhttpd: shcmd (165): mkdir -p /mnt/disk4
Dec 18 20:02:37 Tower emhttpd: shcmd (166): mount -t xfs -o noatime,nouuid /dev/md4 /mnt/disk4
Dec 18 20:02:37 Tower kernel: XFS (md4): Mounting V5 Filesystem
Dec 18 20:02:37 Tower kernel: XFS (md4): Ending clean mount
Dec 18 20:02:37 Tower emhttpd: shcmd (167): xfs_growfs /mnt/disk4
Dec 18 20:02:37 Tower root: meta-data=/dev/md4 isize=512 agcount=4, agsize=122094660 blks
Dec 18 20:02:37 Tower root: = sectsz=512 attr=2, projid32bit=1
Dec 18 20:02:37 Tower root: = crc=1 finobt=1, sparse=1, rmapbt=0
Dec 18 20:02:37 Tower root: = reflink=1 bigtime=1 inobtcount=1
Dec 18 20:02:37 Tower root: data = bsize=4096 blocks=488378638, imaxpct=5
Dec 18 20:02:37 Tower root: = sunit=0 swidth=0 blks
Dec 18 20:02:37 Tower root: naming =version 2 bsize=4096 ascii-ci=0, ftype=1
Dec 18 20:02:37 Tower root: log =internal log bsize=4096 blocks=238466, version=2
Dec 18 20:02:37 Tower root: = sectsz=512 sunit=0 blks, lazy-count=1
Dec 18 20:02:37 Tower root: realtime =none extsz=4096 blocks=0, rtextents=0
Dec 18 20:02:37 Tower emhttpd: shcmd (168): mkdir -p /mnt/disk5
Dec 18 20:02:37 Tower emhttpd: shcmd (169): blkid -t TYPE='xfs' /dev/md5 &> /dev/null
Dec 18 20:02:37 Tower emhttpd: shcmd (169): exit status: 2
Dec 18 20:02:37 Tower emhttpd: shcmd (170): blkid -t TYPE='btrfs' /dev/md5 &> /dev/null
Dec 18 20:02:37 Tower emhttpd: shcmd (170): exit status: 2
Dec 18 20:02:37 Tower emhttpd: shcmd (171): blkid -t TYPE='reiserfs' /dev/md5 &> /dev/null
Dec 18 20:02:37 Tower emhttpd: shcmd (171): exit status: 2
Dec 18 20:02:37 Tower emhttpd: /mnt/disk5 mount error: Unsupported or no file system
Dec 18 20:02:37 Tower emhttpd: shcmd (172): umount /mnt/disk5
Dec 18 20:02:38 Tower root: umount: /mnt/disk5: not mounted.
Dec 18 20:02:38 Tower emhttpd: shcmd (172): exit status: 32
Dec 18 20:02:38 Tower emhttpd: shcmd (173): rmdir /mnt/disk5
Dec 18 20:02:38 Tower emhttpd: shcmd (174): sync
Dec 18 20:02:38 Tower emhttpd: shcmd (175): mkdir /mnt/user0
Dec 18 20:02:38 Tower emhttpd: shcmd (176): /usr/local/sbin/shfs /mnt/user0 -disks 30 -o default_permissions,allow_other,noatime |& logger
Dec 18 20:02:38 Tower emhttpd: shcmd (177): mkdir /mnt/user
Dec 18 20:02:38 Tower emhttpd: shcmd (178): /usr/local/sbin/shfs /mnt/user -disks 31 -o default_permissions,allow_other,noatime -o remember=0 |& logger
Dec 18 20:02:38 Tower emhttpd: shcmd (180): /usr/local/sbin/update_cron

Fix Common Problems is also reporting this for every share on the drive, somewhat expected I think.

Edited December 19, 2022 by mattw: More info
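For readers skimming a long syslog like the one above, the failed mounts can be pulled out with a few lines of Python. This is only a minimal sketch: the function name `find_mount_errors` and the regular expression are illustrative assumptions, written against the emhttpd "mount error" summary lines that appear in this log, and may not match other Unraid versions.

```python
import re

# Assumed pattern, based only on the emhttpd summary lines in the log above, e.g.
# "Tower emhttpd: /mnt/disk3 mount error: Wrong or no file system"
MOUNT_ERROR = re.compile(r"emhttpd: (/mnt/disk\d+) mount error: (.+)$")


def find_mount_errors(lines):
    """Return a dict mapping mount point to the reported mount error text."""
    errors = {}
    for line in lines:
        match = MOUNT_ERROR.search(line.strip())
        if match:
            errors[match.group(1)] = match.group(2)
    return errors


# Three lines copied from the log excerpt above.
sample = [
    "Dec 18 20:02:36 Tower kernel: XFS (md1): Ending clean mount",
    "Dec 18 20:02:37 Tower emhttpd: /mnt/disk3 mount error: Wrong or no file system",
    "Dec 18 20:02:37 Tower emhttpd: /mnt/disk5 mount error: Unsupported or no file system",
]

for mount_point, reason in find_mount_errors(sample).items():
    print(f"{mount_point}: {reason}")
```

On the excerpt above this reports disk3 and disk5, which matches the two problem disks described in the post.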
mattw Posted December 19, 2022 Author

So, as a follow-up... with the array in this state, I am still seeing my shares and the files? I am copying off my movies and kids' pics to be safe.
JorgeB Posted December 19, 2022

You might still be able to recover at least some data; it will depend on the amount of damage done to parity and disk3. You can try to recover data from both the old disk3 and the emulated disk3 and see if one is better than the other.

When you are done with the backups, first restore the array:

-Tools -> New Config -> Retain current configuration: All -> Apply
-Check all assignments and re-assign parity to the correct slot
-IMPORTANT - Check both "parity is already valid" and "maintenance mode" and start the array (note that the GUI will still show that data on parity disk(s) will be overwritten; this is normal as it doesn't account for the checkbox, but it won't be as long as it's checked)
-Stop array
-Unassign disk3
-Start array (in normal mode now), and post the diagnostics.
mattw Posted December 19, 2022 Author

I will try this evening to make these changes. I am curious... the dashboard says that the array is offline, but my shares are up... Why?
mattw Posted December 19, 2022 Author

5 hours ago, JorgeB said:
You might still be able to recover at least some data; it will depend on the amount of damage done to parity and disk3. You can try to recover data from both the old disk3 and the emulated disk3 and see if one is better than the other.
When you are done with the backups, first restore the array:
-Tools -> New Config -> Retain current configuration: All -> Apply
-Check all assignments and re-assign parity to the correct slot

This tool seems to get me in more trouble... Right now, I have the parity disk as disk 5. The server says the array is stopped. So, do I move disk 5 to the parity slot before running New Config or after? It seems to me that if I move disk 5, it is going to tell me that it is missing, and that will be a problem. Also, in the current state I do not have the "parity is already valid" and "maintenance mode" boxes.

-IMPORTANT - Check both "parity is already valid" and "maintenance mode" and start the array (note that the GUI will still show that data on parity disk(s) will be overwritten; this is normal as it doesn't account for the checkbox, but it won't be as long as it's checked)
-Stop array
-Unassign disk3
-Start array (in normal mode now), and post the diagnostics.
JorgeB Posted December 19, 2022

That screenshot is not showing the new config.
mattw Posted December 19, 2022 Author

Correct, I have not made the change yet. Do I need to move the drive configs around before or after the config new? I seem to have a fundamental misunderstanding of the config new tool.
JorgeB Posted December 19, 2022

1 hour ago, mattw said:
Do I need to move the drive configs around before or after the config new?

No, you only need to assign disk5 as parity, assuming the other ones are correctly assigned.
mattw Posted December 20, 2022 Author (edited)

Ok, so I did the new config and assigned all. Went back to the array screen and it will not let me unassign disk 5. When I tell it "no device" it just leaves it as is. I have attached my current diags.

So, I am a network engineer for a major university and have been for 32 years, but a server guy I am not! I feel really dumb at the moment. Why do my array and shares appear to be online when the dashboard tells me that the array is stopped?

This is my array options screen: no option for valid parity or to start or stop the array.

tower-diagnostics-20221219-2105.zip

Edited December 20, 2022 by mattw: More info.
JorgeB Posted December 20, 2022

No new config was done in the diags posted. If you are using Firefox, reboot first, then try again with a different browser.
mattw Posted December 20, 2022 Author (edited)

Ok, different browser, which has never even connected to the server... New config, selected all and applied. Back to Main and tried to reassign drive 5, and when I tell it "no device" it will not remove disk 5 no matter what. It still insists my array is offline, but I can still get to all of my shares. I am still getting nightly emails from it telling me it is healthy...?

tower-diagnostics-20221220-0933.zip

Edited December 20, 2022 by mattw
trurl Posted December 20, 2022

Do you have any other device (including mobile app) or other browsers or browser tabs accessing your Unraid webUI?
JorgeB Posted December 20, 2022

If those are the latest diags, you didn't reboot.

5 hours ago, JorgeB said:
reboot first
mattw Posted December 20, 2022 Author (edited)

Ok, the reboot has been done and the new drive config is in place; the system was set to boot into maintenance mode and parity is marked valid. Should I start the array? How do I tell if I am in maintenance mode? Sorry to be so dense.

So, if I am following correctly... I should check maintenance mode and proceed as below?

-IMPORTANT - Check both "parity is already valid" and "maintenance mode" and start the array (note that the GUI will still show that data on parity disk(s) will be overwritten; this is normal as it doesn't account for the checkbox, but it won't be as long as it's checked)
-Stop array
-Unassign disk3
-Start array (in normal mode now), and post the diagnostics.

tower-diagnostics-20221220-1151.zip

Edited December 20, 2022 by mattw
JorgeB Posted December 20, 2022

16 minutes ago, mattw said:
system was set to boot into maintenance mode and parity is marked valid.

If this was already done, do this part now:

17 minutes ago, mattw said:
-Unassign disk3
-Start array (in normal mode now), and post the diagnostics.
mattw Posted December 20, 2022 Author

It was done, before the requested reboot. Now I do not have the option on screen to do anything with parity, and parity indicates valid. I tried to start the array and the note at the bottom of the screen indicates "Array Stopped•stale configuration".
JorgeB Posted December 20, 2022

50 minutes ago, mattw said:
Array Stopped•stale configuration".

Are you using Firefox?
mattw Posted December 20, 2022 Author (edited)

I have tried with Firefox (my normal browser) and with Microsoft Edge. Both seem to yield the same results. BTW, I just did a reboot, as the only options in the "Array Operations" tab were to reboot or shut down. I can't run it on my phone (S10+); it does not render well enough to trust pushing buttons with my old eyes.

This is so frustrating. I had this server running for years on 5.0 until my key died, and life was in the way and I could not take enough time to troubleshoot it. Then adding the cache drive and RAM, plus my lack of abilities, got me to this point.

I will quit using Firefox during this process; it must have real issues with Unraid. After the reboot from Edge, I have the option to start the array and to enable maintenance mode when I do it. The stale config message is now gone.

Edited December 20, 2022 by mattw
trurl Posted December 20, 2022

26 minutes ago, mattw said:
start the array and to enable maintenance mode

NO

2 hours ago, JorgeB said:
2 hours ago, mattw said:
-Unassign disk3
-Start array (in normal mode now), and post the diagnostics.
mattw Posted December 20, 2022 Author

Ok, the array is online minus disk 3.

tower-diagnostics-20221220-1552.zip
trurl Posted December 21, 2022

The next step is to check the filesystem on emulated disk3. Capture the output and post it.
mattw Posted December 21, 2022 Author

So, following the above guide...

If the file system is XFS or ReiserFS (but NOT BTRFS), then you must start the array in Maintenance mode, by clicking the Maintenance mode check box before clicking the Start button. This starts the unRAID driver but does not mount any of the drives.

So, I am in maintenance mode as requested in the doc.

You should see a page of options for that drive, beginning with various partition, file system format, and spin down settings. The section following that is the one you want, titled Check Filesystem Status. There is a box with the 2 words Not available in it. This is the command output box, where the progress and results of the command will be displayed. Below that is the Check button that starts the test or repair, followed by the options box where you can type in options for the test/repair command. The options box may already have default options in it, for a read-only check of the file system. For more help, click the Help button in the upper right.

Clicking disk 3 gives me none of the options the doc says I should see. So, with an emulated drive, how do I get to the menus I need to see? I do see the options on one of my installed and live drives.
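The quoted doc mentions a read-only check of the file system. As a rough sketch only, the equivalent command line for an XFS disk might be assembled as below. Nothing here is confirmed by the thread: the helper `build_check_command` is hypothetical, the device path `/dev/md3` is taken from the mount lines in the log in the first post and may be named differently on other Unraid versions, and the array is assumed to be started in Maintenance mode as the doc requires. The `-n` flag tells `xfs_repair` to report problems without modifying the filesystem. The sketch only prints the command and does not run it.

```python
import shlex


def build_check_command(disk_number):
    """Build a read-only xfs_repair command for an array disk (hypothetical helper)."""
    if disk_number < 1:
        raise ValueError("disk numbers start at 1")
    # Assumption: array disks appear as /dev/mdN, as in the log from the first post.
    device = f"/dev/md{disk_number}"
    # -n: no-modify mode, so the filesystem is only inspected, never changed.
    return ["xfs_repair", "-n", device]


# Print the command for disk 3 instead of executing it.
print(shlex.join(build_check_command(3)))
```

Printing rather than executing keeps the sketch harmless; whether running the command is appropriate on an emulated disk is a question for the people helping in the thread.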
trurl Posted December 21, 2022

Do you know what filesystem is supposed to be on the disk?
mattw Posted December 21, 2022 Author

Yes, it was xfs.
mattw Posted December 21, 2022 Author

I should add that the files that were on it are not being emulated... they are just not there. Should I just add it as a blank drive at this point?