So, I think I screwed myself tonight, looking for help. 6.11.5+

mattw · December 19, 2022

It all started when I wanted to remove my cache drive. The system had been unstable after a RAM upgrade and the addition of my first cache drive, so I shut down and removed the RAM. Booted up and everything was good. Did Tools > New Config, cleared the config, and marked the cache drive as not installed. Rebooted to an offline array that wanted me to reassign devices.

This is where the problem begins... my parity drive and one of my 2TB data drives are 1 digit different in the serial number, and I reversed them and rebooted. The system restarted and started a parity check, which I canceled because I thought it was odd. I looked at the disk assignments and realized I had really screwed up. I reassigned the drives and rebooted with parity valid checked to try and avert more damage. Realizing what I had done, I unassigned the parity disk and moved it to the array and started the array. It now tells me that the 2 WD drives are unreadable. I somewhat expected that, since it likely overwrote part of one or the other.

The cache drive that started all of this was on a PCI controller. I removed the controller and moved the empty cache drive to one of my motherboard SATA controller ports.

So, how screwed am I? I do not have a backup, since getting to an offsite backup at 30Mbps is painful. The loss of years' worth of movies and all of my digitized CD collection will be painful.

I am in this state now... parity drive in array and array not starting, but it knows about my shares.

image.png.19d7487b65a157ac357b7e32ac44973f.png

The system now thinks my array is 70% full, was 49% before this adventure.

Interesting stuff from the logs... disk 3 shows corruption; it was the disk swapped in place of the parity drive. Disk 5 shows an unsupported file system (it is the real parity drive).

Dec 18 20:02:36 Tower  emhttpd: shcmd (154): mkdir -p /mnt/disk1
Dec 18 20:02:36 Tower  emhttpd: shcmd (155): mount -t xfs -o noatime,nouuid /dev/md1 /mnt/disk1
Dec 18 20:02:36 Tower kernel: SGI XFS with ACLs, security attributes, no debug enabled
Dec 18 20:02:36 Tower kernel: XFS (md1): Mounting V5 Filesystem
Dec 18 20:02:36 Tower kernel: XFS (md1): Ending clean mount
Dec 18 20:02:36 Tower  emhttpd: shcmd (156): xfs_growfs /mnt/disk1
Dec 18 20:02:36 Tower root: meta-data=/dev/md1               isize=512    agcount=4, agsize=91571160 blks
Dec 18 20:02:36 Tower root:          =                       sectsz=512   attr=2, projid32bit=1
Dec 18 20:02:36 Tower root:          =                       crc=1        finobt=1, sparse=1, rmapbt=0
Dec 18 20:02:36 Tower root:          =                       reflink=1    bigtime=1 inobtcount=1
Dec 18 20:02:36 Tower root: data     =                       bsize=4096   blocks=366284638, imaxpct=5
Dec 18 20:02:36 Tower root:          =                       sunit=0      swidth=0 blks
Dec 18 20:02:36 Tower root: naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
Dec 18 20:02:36 Tower root: log      =internal log           bsize=4096   blocks=178849, version=2
Dec 18 20:02:36 Tower root:          =                       sectsz=512   sunit=0 blks, lazy-count=1
Dec 18 20:02:36 Tower root: realtime =none                   extsz=4096   blocks=0, rtextents=0
Dec 18 20:02:36 Tower  emhttpd: shcmd (157): mkdir -p /mnt/disk2
Dec 18 20:02:36 Tower  emhttpd: shcmd (158): mount -t xfs -o noatime,nouuid /dev/md2 /mnt/disk2
Dec 18 20:02:36 Tower kernel: XFS (md2): Mounting V5 Filesystem
Dec 18 20:02:37 Tower kernel: XFS (md2): Ending clean mount
Dec 18 20:02:37 Tower  emhttpd: shcmd (159): xfs_growfs /mnt/disk2
Dec 18 20:02:37 Tower root: meta-data=/dev/md2               isize=512    agcount=4, agsize=91571160 blks
Dec 18 20:02:37 Tower root:          =                       sectsz=512   attr=2, projid32bit=1
Dec 18 20:02:37 Tower root:          =                       crc=1        finobt=1, sparse=1, rmapbt=0
Dec 18 20:02:37 Tower root:          =                       reflink=1    bigtime=1 inobtcount=1
Dec 18 20:02:37 Tower root: data     =                       bsize=4096   blocks=366284638, imaxpct=5
Dec 18 20:02:37 Tower root:          =                       sunit=0      swidth=0 blks
Dec 18 20:02:37 Tower root: naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
Dec 18 20:02:37 Tower root: log      =internal log           bsize=4096   blocks=178849, version=2
Dec 18 20:02:37 Tower root:          =                       sectsz=512   sunit=0 blks, lazy-count=1
Dec 18 20:02:37 Tower root: realtime =none                   extsz=4096   blocks=0, rtextents=0
Dec 18 20:02:37 Tower  emhttpd: shcmd (160): mkdir -p /mnt/disk3
Dec 18 20:02:37 Tower  emhttpd: shcmd (161): blkid -t TYPE='xfs' /dev/md3 &> /dev/null
Dec 18 20:02:37 Tower  emhttpd: shcmd (162): mount -t xfs -o noatime,nouuid /dev/md3 /mnt/disk3
Dec 18 20:02:37 Tower kernel: XFS (md3): Mounting V5 Filesystem
Dec 18 20:02:37 Tower kernel: XFS (md3): Corruption warning: Metadata has LSN (1:115531) ahead of current LSN (1:71954). Please unmount and run xfs_repair (>= v4.3) to resolve.
Dec 18 20:02:37 Tower kernel: XFS (md3): log mount/recovery failed: error -22
Dec 18 20:02:37 Tower kernel: XFS (md3): log mount failed
Dec 18 20:02:37 Tower root: mount: /mnt/disk3: wrong fs type, bad option, bad superblock on /dev/md3, missing codepage or helper program, or other error.
Dec 18 20:02:37 Tower root:        dmesg(1) may have more information after failed mount system call.
Dec 18 20:02:37 Tower  emhttpd: shcmd (162): exit status: 32
Dec 18 20:02:37 Tower  emhttpd: /mnt/disk3 mount error: Wrong or no file system
Dec 18 20:02:37 Tower  emhttpd: shcmd (163): umount /mnt/disk3
Dec 18 20:02:37 Tower root: umount: /mnt/disk3: not mounted.
Dec 18 20:02:37 Tower  emhttpd: shcmd (163): exit status: 32
Dec 18 20:02:37 Tower  emhttpd: shcmd (164): rmdir /mnt/disk3
Dec 18 20:02:37 Tower  emhttpd: shcmd (165): mkdir -p /mnt/disk4
Dec 18 20:02:37 Tower  emhttpd: shcmd (166): mount -t xfs -o noatime,nouuid /dev/md4 /mnt/disk4
Dec 18 20:02:37 Tower kernel: XFS (md4): Mounting V5 Filesystem
Dec 18 20:02:37 Tower kernel: XFS (md4): Ending clean mount
Dec 18 20:02:37 Tower  emhttpd: shcmd (167): xfs_growfs /mnt/disk4
Dec 18 20:02:37 Tower root: meta-data=/dev/md4               isize=512    agcount=4, agsize=122094660 blks
Dec 18 20:02:37 Tower root:          =                       sectsz=512   attr=2, projid32bit=1
Dec 18 20:02:37 Tower root:          =                       crc=1        finobt=1, sparse=1, rmapbt=0
Dec 18 20:02:37 Tower root:          =                       reflink=1    bigtime=1 inobtcount=1
Dec 18 20:02:37 Tower root: data     =                       bsize=4096   blocks=488378638, imaxpct=5
Dec 18 20:02:37 Tower root:          =                       sunit=0      swidth=0 blks
Dec 18 20:02:37 Tower root: naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
Dec 18 20:02:37 Tower root: log      =internal log           bsize=4096   blocks=238466, version=2
Dec 18 20:02:37 Tower root:          =                       sectsz=512   sunit=0 blks, lazy-count=1
Dec 18 20:02:37 Tower root: realtime =none                   extsz=4096   blocks=0, rtextents=0
Dec 18 20:02:37 Tower  emhttpd: shcmd (168): mkdir -p /mnt/disk5
Dec 18 20:02:37 Tower  emhttpd: shcmd (169): blkid -t TYPE='xfs' /dev/md5 &> /dev/null
Dec 18 20:02:37 Tower  emhttpd: shcmd (169): exit status: 2
Dec 18 20:02:37 Tower  emhttpd: shcmd (170): blkid -t TYPE='btrfs' /dev/md5 &> /dev/null
Dec 18 20:02:37 Tower  emhttpd: shcmd (170): exit status: 2
Dec 18 20:02:37 Tower  emhttpd: shcmd (171): blkid -t TYPE='reiserfs' /dev/md5 &> /dev/null
Dec 18 20:02:37 Tower  emhttpd: shcmd (171): exit status: 2
Dec 18 20:02:37 Tower  emhttpd: /mnt/disk5 mount error: Unsupported or no file system
Dec 18 20:02:37 Tower  emhttpd: shcmd (172): umount /mnt/disk5
Dec 18 20:02:38 Tower root: umount: /mnt/disk5: not mounted.
Dec 18 20:02:38 Tower  emhttpd: shcmd (172): exit status: 32
Dec 18 20:02:38 Tower  emhttpd: shcmd (173): rmdir /mnt/disk5
Dec 18 20:02:38 Tower  emhttpd: shcmd (174): sync
Dec 18 20:02:38 Tower  emhttpd: shcmd (175): mkdir /mnt/user0
Dec 18 20:02:38 Tower  emhttpd: shcmd (176): /usr/local/sbin/shfs /mnt/user0 -disks 30 -o default_permissions,allow_other,noatime  |& logger
Dec 18 20:02:38 Tower  emhttpd: shcmd (177): mkdir /mnt/user
Dec 18 20:02:38 Tower  emhttpd: shcmd (178): /usr/local/sbin/shfs /mnt/user -disks 31 -o default_permissions,allow_other,noatime -o remember=0  |& logger
Dec 18 20:02:38 Tower  emhttpd: shcmd (180): /usr/local/sbin/update_cron
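
For anyone triaging a log like the one above, here is a minimal Python sketch for pulling out just the mount failures. It assumes the syslog format shown in this post (`emhttpd: /mnt/diskN mount error: ...`); the function name `find_mount_errors` and the sample list are illustrative and not part of Unraid.

```python
import re

# Matches lines such as:
#   "... emhttpd: /mnt/disk3 mount error: Wrong or no file system"
MOUNT_ERROR = re.compile(r"emhttpd: (/mnt/disk\d+) mount error: (.+)$")

def find_mount_errors(lines):
    """Return (mount_point, reason) pairs for every emhttpd mount error."""
    errors = []
    for line in lines:
        match = MOUNT_ERROR.search(line.rstrip())
        if match:
            errors.append((match.group(1), match.group(2)))
    return errors

# Three lines copied from the log excerpt above.
sample = [
    "Dec 18 20:02:37 Tower  emhttpd: /mnt/disk3 mount error: Wrong or no file system",
    "Dec 18 20:02:37 Tower kernel: XFS (md4): Ending clean mount",
    "Dec 18 20:02:37 Tower  emhttpd: /mnt/disk5 mount error: Unsupported or no file system",
]

for mount_point, reason in find_mount_errors(sample):
    print(f"{mount_point}: {reason}")
```

Run against the full syslog, this would list only disk3 and disk5, which matches the two problem disks described above.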

Fix Common Problems is also reporting this for every share on the drive, which is somewhat expected, I think.

Edited December 19, 2022 by mattw
More info

mattw · December 19, 2022

So, as a followup... with the array in this state, I am still seeing my shares and the files? I am copying off my movies and kids pics to be safe.

JorgeB · December 19, 2022

You might still be able to recover at least some data; it will depend on the amount of damage done to parity and disk3. You can try to recover data from both the old disk3 and the emulated disk3 and see if one is better than the other.

When you are done with the backups first restore the array:

-Tools -> New Config -> Retain current configuration: All -> Apply
-Check all assignments and re-assign parity to the correct slot
-IMPORTANT - Check both "parity is already valid" and "maintenance mode" and start the array (note that the GUI will still show that data on parity disk(s) will be overwritten, this is normal as it doesn't account for the checkbox, but it won't be as long as it's checked)
-Stop array
-Unassign disk3
-Start array (in normal mode now), and post the diagnostics.
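
To make the "old disk3 versus emulated disk3" idea concrete, here is a toy Python sketch of single parity. It assumes plain XOR parity over equal-sized disks, which is how single-parity arrays work in general; the tiny byte strings and the helper names `xor_blocks` and `emulate` are invented for illustration and say nothing about Unraid's actual on-disk layout.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def emulate(parity, other_disks):
    """Rebuild a missing disk from parity plus every remaining data disk."""
    return xor_blocks([parity, *other_disks])

# Made-up contents for three tiny data disks.
disk1 = b"\x11\x22\x33\x44"
disk2 = b"\xaa\xbb\xcc\xdd"
disk3 = b"\x01\x02\x03\x04"
parity = xor_blocks([disk1, disk2, disk3])

# With intact parity, the emulated disk3 matches the real one.
assert emulate(parity, [disk1, disk2]) == disk3

# If the first two bytes of parity were overwritten, only those bytes
# of the emulated disk3 are wrong; the rest still comes back intact.
damaged_parity = b"\x00\x00" + parity[2:]
emulated = emulate(damaged_parity, [disk1, disk2])
print(emulated[2:] == disk3[2:])
```

In this simplified model, the emulated disk is only as good as parity and the other disks, so whichever of the physical disk3 or parity took fewer stray writes should give the better copy.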

mattw · December 19, 2022

I will try this evening to make these changes.

I am curious... the dashboard says that the array is offline, but my shares are up... Why?

mattw · December 19, 2022

5 hours ago, JorgeB said:

You might still be able to recover at least some data; it will depend on the amount of damage done to parity and disk3. You can try to recover data from both the old disk3 and the emulated disk3 and see if one is better than the other.

When you are done with the backups first restore the array:

-Tools -> New Config -> Retain current configuration: All -> Apply
-Check all assignments and re-assign parity to the correct slot

This tool seems to get me in more trouble... Right now, I have the parity disk as disk 5. The server says the array is stopped. So, do I move the disk 5 to the parity slot before running new config or after? It seems to me if I move the disk 5 it is going to tell me that it is missing and that will be a problem. Also in the current state I do not have the "parity is already valid" and "maintenance mode" boxes.
-IMPORTANT - Check both "parity is already valid" and "maintenance mode" and start the array (note that the GUI will still show that data on parity disk(s) will be overwritten, this is normal as it doesn't account for the checkbox, but it won't be as long as it's checked)
-Stop array
-Unassign disk3
-Start array (in normal mode now), and post the diagnostics.

JorgeB · December 19, 2022

That screenshot is not showing the new config.

mattw · December 19, 2022

Correct, I have not made the change yet. Do I need to move the drive configs around before or after the config new? I seem to have a fundamental misunderstanding of the config new tool.

JorgeB · December 19, 2022

1 hour ago, mattw said:

Do I need to move the drive configs around before or after the config new?

No, you only need to assign disk5 as parity, assuming the other ones are correctly assigned.

mattw · December 20, 2022

Ok, so I did the new config and assigned all. Went back to the array screen and it will not let me unassign disk 5. When I tell it "no device" it just leaves it as is.

I have attached my current diags. So, I am a network engineer for a major university and have been for 32 years, but a server guy I am not! I feel really dumb at the moment. Why do my array and shares appear to be online when the dashboard tells me that the array is stopped? This is my array options screen: no option for valid parity or to start or stop the array.

tower-diagnostics-20221219-2105.zip

Edited December 20, 2022 by mattw
More info.

JorgeB · December 20, 2022

No new config was done in the diags posted. If you are using Firefox, reboot first, then try again with a different browser.

mattw · December 20, 2022

Ok, different browser which has never even connected to the server... New config, selected all and applied. Back to main and tried to reassign drive 5 and when I tell it "no device" it will not remove disk 5 no matter what. It still insists my array is offline, but I can still get to all of my shares.

I am still getting nightly emails from it telling me it is healthy...?

image.png.37ce63680ef155a35a227e6c294c10aa.png

tower-diagnostics-20221220-0933.zip

Edited December 20, 2022 by mattw

trurl · December 20, 2022

Do you have any other device (including mobile app) or other browsers or browser tabs accessing your Unraid webUI?

JorgeB · December 20, 2022

If those are the latest diags, you didn't reboot.

5 hours ago, JorgeB said:

reboot first

mattw · December 20, 2022

Ok, reboot has been done and new drive config is in place, system was set to boot into maintenance mode and parity is marked valid. Should I start the array? How do I tell if I am in maintenance mode? Sorry to be so dense.

So, if I am following correctly... I should check maintenance mode and proceed as below?

IMPORTANT - Check both "parity is already valid" and "maintenance mode" and start the array (note that the GUI will still show that data on parity disk(s) will be overwritten, this is normal as it doesn't account for the checkbox, but it won't be as long as it's checked)
-Stop array
-Unassign disk3
-Start array (in normal mode now), and post the diagnostics.

image.png.47eddc28dade3da3ad2eb3f27ed24d45.png

image.png.1ccb206c57cae5b3ee277ee4ea8de6c3.png

tower-diagnostics-20221220-1151.zip

Edited December 20, 2022 by mattw

JorgeB · December 20, 2022

16 minutes ago, mattw said:

system was set to boot into maintenance mode and parity is marked valid.

If this was already done do this part now:

17 minutes ago, mattw said:

-Unassign disk3
-Start array (in normal mode now), and post the diagnostics.

mattw · December 20, 2022

It was done, before the requested reboot. Now I do not have the option on screen to do anything with parity and parity indicates valid. I tried to start the array and the note at the bottom of the screen indicates " Array Stopped•stale configuration".

JorgeB · December 20, 2022

50 minutes ago, mattw said:

Array Stopped•stale configuration".

Are you using Firefox?

mattw · December 20, 2022

I have tried with Firefox (my normal browser) and with Microsoft Edge. Both seem to yield the same results. BTW, just did a reboot as the only option in the "Array Operations" tab was to reboot or shutdown. I can't run it on my phone, S10+, does not render well enough to trust pushing buttons with my old eyes. This is so frustrating, had this server running for years on 5.0 until my key died and life was in the way and I could not take enough time to troubleshoot it. Then adding cache drive and ram and my lack of abilities got me to this point.

I will quit using Firefox during this process, it must have real issues with Unraid.

After the reboot from Edge, I have the option to start the array and to enable maintenance mode when I do it. The stale config message is now gone.

Edited December 20, 2022 by mattw

trurl · December 20, 2022

26 minutes ago, mattw said:

start the array and to enable maintenance mode

NO

2 hours ago, JorgeB said:

2 hours ago, mattw said:

-Unassign disk3
-Start array (in normal mode now), and post the diagnostics.

mattw · December 20, 2022

Ok, the array is online minus disk 3.

tower-diagnostics-20221220-1552.zip

trurl · December 21, 2022

The next step is to check the filesystem on emulated disk3. Capture the output and post it.
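
For reference, a hedged sketch of what a read-only check looks like from the command line, wrapped in Python so the output can be captured to a file for posting. It assumes the array is started in maintenance mode and that disk3 is exposed as `/dev/md3`, as in the syslog excerpt earlier in this thread. `xfs_repair -n` is the no-modify mode, so nothing is written to the disk. The function names and any output filename are made up for this example.

```python
import subprocess

def build_check_command(disk_number):
    """Command for a read-only (-n) XFS check of an array disk.

    Assumes the /dev/mdN device naming seen in the syslog above.
    """
    return ["xfs_repair", "-n", f"/dev/md{disk_number}"]

def run_and_capture(disk_number, output_path):
    """Run the check and save stdout and stderr together for posting."""
    result = subprocess.run(
        build_check_command(disk_number),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    with open(output_path, "w") as handle:
        handle.write(result.stdout)
    return result.returncode

# Only show the command here; nothing is executed against a disk.
print(" ".join(build_check_command(3)))
```

Calling `run_and_capture(3, "disk3_check.txt")` on the server itself would run the check and write the report to that (hypothetical) file.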

mattw · December 21, 2022

So, following the above guide...

If the file system is XFS or ReiserFS (but NOT BTRFS), then you must start the array in Maintenance mode, by clicking the Maintenance mode check box before clicking the Start button. This starts the unRAID driver but does not mount any of the drives.

So, I am in maintenance mode as requested in the doc.

You should see a page of options for that drive, beginning with various partition, file system format, and spin down settings. The section following that is the one you want, titled Check Filesystem Status. There is a box with the 2 words Not available in it. This is the command output box, where the progress and results of the command will be displayed. Below that is the Check button that starts the test or repair, followed by the options box where you can type in options for the test/repair command. The options box may already have default options in it, for a read-only check of the file system. For more help, click the Help button in the upper right.

The result of clicking the disk 3 gives me none of the options I would expect to see in the doc. So, with an emulated drive how do I get to the menus I need to see?

image.png.b547b31b31f5ab051d5e12f33bac139c.png

I do see the options on one of my installed and live drives.

trurl · December 21, 2022

Do you know what filesystem is supposed to be on the disk?

mattw · December 21, 2022

Yes, it was xfs.

mattw · December 21, 2022

I should add that files that were on it are not being emulated... they are just not there. Should I just add it as a blank drive at this point?
