
GUI blew up in my face :( Now what?



Talk about timing. Right in the middle of a parity check, the GUI decided to go south on me :( The good news is that the parity check is still running in the background. Is there any way to restart the GUI, or should I just wait for the parity check to finish and reboot from the command line?

 

I am on 6.9.2, by the way.
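For what it's worth, I found suggestions that the webGUI can be restarted from a shell with the stock rc script (listing it here in case someone can confirm the path is right for 6.9.x and that it's safe to run mid-parity):

# restart the nginx service that serves the webGUI, without stopping the array
/etc/rc.d/rc.nginx restart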

 

And what could cause this GUI crash? See attached pic.

 

And I get this in syslog:

 

 

root@NAS-UNRAID:/var/log# tail -f syslog
Jul 17 18:51:38 NAS-UNRAID root: error: /webGui/include/DashUpdate.php: uninitialized csrf_token
Jul 17 18:51:38 NAS-UNRAID root: error: /webGui/include/DashUpdate.php: uninitialized csrf_token
Jul 17 18:51:41 NAS-UNRAID root: error: /webGui/include/DashUpdate.php: uninitialized csrf_token
Jul 17 18:51:46 NAS-UNRAID root: error: /webGui/include/DashUpdate.php: uninitialized csrf_token
Jul 17 18:51:46 NAS-UNRAID root: error: /webGui/include/DashUpdate.php: uninitialized csrf_token
Jul 17 18:51:46 NAS-UNRAID root: error: /webGui/include/DashUpdate.php: uninitialized csrf_token
Jul 17 18:51:51 NAS-UNRAID root: error: /webGui/include/DashUpdate.php: uninitialized csrf_token
Jul 17 18:51:53 NAS-UNRAID root: error: /webGui/include/DashUpdate.php: uninitialized csrf_token
Jul 17 18:51:53 NAS-UNRAID root: error: /webGui/include/DashUpdate.php: uninitialized csrf_token
Jul 17 18:51:56 NAS-UNRAID root: error: /webGui/include/DashUpdate.php: uninitialized csrf_token

 

 

[Screenshot attached: GUI-UNRAID.png]


Well, I managed to stop the parity check and do a reboot, but one of the disks now shows as disabled?

 

Should I start my parity check again, and hopefully that will fix the disabled disk?

 

[Screenshot attached: DISABLED.png]

 

root@NAS-UNRAID:~# df -h
Filesystem      Size  Used Avail Use% Mounted on
rootfs           16G  755M   15G   5% /
devtmpfs         16G     0   16G   0% /dev
tmpfs            16G     0   16G   0% /dev/shm
cgroup_root     8.0M     0  8.0M   0% /sys/fs/cgroup
tmpfs           128M  280K  128M   1% /var/log
/dev/sda1        59G  526M   59G   1% /boot
overlay          16G  755M   15G   5% /lib/modules
overlay          16G  755M   15G   5% /lib/firmware
tmpfs           1.0M     0  1.0M   0% /mnt/disks
tmpfs           1.0M     0  1.0M   0% /mnt/remotes
tmpfs           1.0M     0  1.0M   0% /mnt/rootshare
/dev/md1        7.3T  4.4T  3.0T  60% /mnt/disk1
/dev/md2        7.3T  3.2T  4.2T  43% /mnt/disk2
/dev/md3        7.3T  2.6T  4.8T  36% /mnt/disk3
/dev/md4        7.3T   52G  7.3T   1% /mnt/disk4
/dev/md5        7.3T   52G  7.3T   1% /mnt/disk5
/dev/md6        7.3T   52G  7.3T   1% /mnt/disk6
/dev/sdb1       448G   25G  421G   6% /mnt/cache
shfs             44T   11T   34T  24% /mnt/user0
shfs             44T   11T   34T  24% /mnt/user
/dev/loop2       20G  2.8G   17G  14% /var/lib/docker
/dev/loop3      1.0G  3.8M  905M   1% /etc/libvirt


Yes, that was the df -h after the reboot.

 

OK, here is the history of what happened, and I am sure it was related.

 

- I created a new share to prep a 1TB transfer from an external SSD drive. I set it to "Yes: Use cache pool (for new files/directories)".
- Plugged in the SSD drive and started the 1TB transfer.
- It turns out my cache drive is only 500GB, so at some point during the transfer the cache drive got full, I started getting these dynamix errors, and the GUI crashed.
- I did a command-line reboot; it came back and started a parity check. During the parity check the mover was trying to move files from the cache to the new share. It did finish moving whatever was on the cache, about 480GB, so the cache drive is now empty.
- I got the dynamix errors again during the parity check and the GUI was messed up, but I managed to stop it and reboot. When it came back online, disk 2 was disabled.
- I uninstalled all my dynamix plugins for now so my GUI stops crashing.
- Now I have started a parity check again. I hope that after the parity check I can rebuild this disk? The data is all there, I guess emulated?

 

What course should I take? Wait for the parity check to finish, hopefully with no errors, and then do a disk rebuild?

Is it possible my parity got messed up during the unsuccessful transfer from the 1TB SSD drive? 

 

Diagnostics attached.

nas-unraid-diagnostics-20220717-1937.zip

15 minutes ago, johnwhicker said:

Yes, that was the df -h after the reboot.

I needed to see it when the GUI problems were happening; the symptoms might have been due to filling rootfs. If it happens again, try the command again.
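If the GUI starts misbehaving again, a quick way to see whether rootfs is the culprit (just standard Linux tooling, nothing Unraid-specific; the depth and cut-off below are only an example):

# how full is rootfs?
df -h /
# biggest top-level directories on rootfs only (-x stays on this filesystem, so /mnt, /proc etc. are skipped)
du -hx -d1 / 2>/dev/null | sort -h | tail -20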

 

The emulated disk2 is showing more than 3TB of contents, so a rebuild should be OK.

 

Since you have dual parity you might as well let the check complete; hopefully there won't be any sync errors. It isn't using disk2 for the check since that disk is disabled.

 

Jul 17 19:24:01 NAS-UNRAID root: Fix Common Problems: Warning: Share IOT-BKP is set for both included (disk3) and excluded (disk1) disks ** Ignored
Jul 17 19:24:01 NAS-UNRAID root: Fix Common Problems: Warning: Share RAID-1 is set for both included (disk1) and excluded (disk2,disk3,disk4,disk5,disk6) disks ** Ignored
Jul 17 19:24:01 NAS-UNRAID root: Fix Common Problems: Warning: Share TM is set for both included (disk2) and excluded (disk1) disks ** Ignored
Jul 17 19:24:01 NAS-UNRAID root: Fix Common Problems: Warning: Share WIN-BKP is set for both included (disk3) and excluded (disk1) disks ** Ignored

There are no good reasons to set both include and exclude, and there are good reasons not to do that. Your settings aren't even consistent, since include means only those disks, and exclude means all disks except those. So, for example, that first one says only disk3 should be used, and also says all disks except disk1 should be used.
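To illustrate with that first warning (one possible way to resolve it, not the only one): for the IOT-BKP share you would either set Included disks to disk3 and leave Excluded disks blank, or leave Included blank and exclude the disks you don't want; setting both at once is what triggers the warning.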

 

 

 

 


Thanks much. I will clean up the share configs after this.

 

For sure it was filling rootfs, as I saw some rootfs errors on my IPMI console, so that's it.

 

So after the parity check is done, I just rebuild disk 2?

- Stop the array
- Set disk2 to be not installed
- Start the array
- Stop the array
- Set disk2 back to the appropriate disk
- Start the array
- Unraid will now commence the rebuild operation.


Any path that isn't actual mounted storage is in rootfs, which is in RAM and is where the OS lives. If you fill it, the OS can't work with its own files anymore.

 

If you have a docker container with a host path that isn't actual mounted storage, for example, that container could fill rootfs.
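One way to review which host paths your containers are actually using (standard docker CLI, nothing specific to your setup) is something like:

# print the host-side source of every mount for each running container
docker ps --format '{{.Names}}' | while read name; do
  echo "== $name"
  docker inspect -f '{{range .Mounts}}{{.Source}}{{"\n"}}{{end}}' "$name"
done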

 

Those are the correct steps to rebuild to the same disk.


Thank you Sir.

 

I looked at all my docker containers and they all have paths to a /mnt storage point so I should be covered.

 

Now, on the cache issue, how does it work? My share was set to use the cache for new files, but the size of the transferred files (1TB) was bigger than the cache (500GB). So what happens when it hits 500GB? That's where my crash happened. Is it just the way the cache behaved, or could it be something with the "DirSyncPro" app I used?

4 hours ago, johnwhicker said:

Thank you Sir.

 

I looked at all my docker containers and they all have paths to a /mnt storage point so I should be covered.

 

Now, on the cache issue, how does it work? My share was set to use the cache for new files, but the size of the transferred files (1TB) was bigger than the cache (500GB). So what happens when it hits 500GB? That's where my crash happened. Is it just the way the cache behaved, or could it be something with the "DirSyncPro" app I used?

You should have a Minimum Free Space value set for the cache (click on it on the Main tab to get to this setting) to stop it getting completely full. Ideally this value should be something like twice the size of the largest file you expect to write (or larger). When the free space on the cache drops below this value, then for subsequent files Unraid will bypass the cache and start writing directly to the array.
 

If you do not set a Minimum Free Space value then Unraid will keep selecting the cache for new files, and if they will not fit you will get out-of-space type errors.
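As a worked example (numbers made up purely for illustration): if the largest file you ever expect to write is around 50GB, setting the cache's Minimum Free Space to 100GB means that once less than 100GB remains free on the pool, new files for that share start going to the array instead of the cache.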

2 hours ago, johnwhicker said:

paths to a /mnt storage point

There is nothing to prevent something from creating a path in /mnt that isn't mounted storage. For example, I've seen cases where people lost their cache for some reason and had a path specified as /mnt/cache, which winds up in rootfs since the cache isn't mounted.
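A quick sanity check for any such path (standard util-linux/coreutils commands; /mnt/cache is just an example path):

# reports "is a mountpoint" only if the path is real mounted storage
mountpoint /mnt/cache
# or see which filesystem the path actually lives on (rootfs means trouble)
df -h /mnt/cache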

11 hours ago, itimpi said:

You should have a Minimum Free Space value set for the cache (click on it on the Main tab to get to this setting) to stop it getting completely full. Ideally this value should be something like twice the size of the largest file you expect to write (or larger). When the free space on the cache drops below this value, then for subsequent files Unraid will bypass the cache and start writing directly to the array.
 

If you do not set a Minimum Free Space value then Unraid will keep selecting the cache for new files, and if they will not fit you will get out-of-space type errors.

 

Well, when I go to Main, under the cache settings, and look at the help, I get this, which conflicts with your statement?

 

Minimum free space: "This defines a "floor" for the amount of free space remaining in the volume. If the free space becomes less than this value, then new files written via user shares will fail with "not enough space" error." There is no mention of bypassing the cache and writing directly to the array?

 

Am I looking at the right settings?

 

[Screenshot attached: Screen Shot 2022-07-18 at 9.05.19 PM.png]

 

4 hours ago, trurl said:

That is one of the features of cache:yes and cache:prefer user shares. For those user shares, if the specified pool (cache) has less than minimum, new files will overflow to the array.

I guess that help text could do with an update to make this clear. I think it is mentioned on the Share Settings page but not here.

 

Personally I would like the default for pools/cache to not be 0 so new users are less likely to get bitten by this issue, but that is a different conversation.

3 minutes ago, itimpi said:

I guess that help text could do with an update to make this clear.

That's explained in the share configuration, where you have the different options like cache:yes, since that's where it matters. As far as I understand, setting a minimum free space on the cache drive itself will not cause an overflow for shares that aren't configured as Cache:Yes.

1 minute ago, Kilrah said:

That's explained in the share configuration, where you have the different options like cache:yes, since that's where it matters. As far as I understand, setting a minimum free space on the cache drive itself will not cause an overflow for shares that aren't configured as Cache:Yes.

I agree, but there is no reason that help text cannot mention overflow to the array with the correct share settings (if just for consistency).

4 minutes ago, Kilrah said:

But would the drive's Minimum Free Space setting trigger an overflow on those that do, even if the share's minimum free space is not configured or is configured lower?

Yes. The share-level setting is designed to handle the case of array drives getting full, not a pool/cache getting full.

 

In the past there was a time when the higher of the share-level setting and the pool setting was (incorrectly, I believe) applied to the pool/cache, but I think this is no longer the case.


Let me see if I got this right. On my share, before transferring the 1TB, my setting was Use cache: Yes for that new share.

 

But when it hit the 500GB maximum of my cache, everything crashed and my GUI blew up. I definitely saw some errors in my IPMI console about the cache being full or something like that, so it did not spill over to the array.

 

Is that because I had nothing set under Main -> Pool Device -> Cache -> Minimum free space (it was 0)?

 

I have a backup Unraid system and will do a test, because this entire experience hosed my system and that ain't good :)

 

I really appreciate the support and guidance

17 minutes ago, johnwhicker said:

Is that because I had nothing set under Main -> Pool Device -> Cache -> Minimum free space (it was 0)?

That could definitely cause the cache to fill up, and btrfs especially is prone to corruption when you fill it.

 

Shouldn't cause a crash, though.
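If you want to check whether the btrfs cache picked up any damage, the usual tooling (standard btrfs-progs commands run against the mounted pool, not steps prescribed earlier in this thread) looks like:

# show per-device error counters (write/read/flush/corruption/generation)
btrfs device stats /mnt/cache
# run a scrub and wait for it to finish (-B stays in the foreground and prints a summary)
btrfs scrub start -B /mnt/cache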

1 hour ago, trurl said:

That could definitely cause the cache to fill up, and btrfs especially is prone to corruption when you fill it.

 

Shouldn't cause a crash, though.

 

Makes sense. Currently I have appdata, domains and system on the cache for performance reasons; I can put them back on the array. Should I reformat the cache or repair it? In case something got corrupted, I prefer to fix it now before data starts moving through the cache again. What do you advise?

 

 

root@NAS-UNRAID:/mnt/cache# ls -al
total 16
drwxrwxrwx  1 nobody users  40 Jul 19 12:39 ./
drwxr-xr-x 14 root   root  280 Jul 19 12:29 ../
drwxrwxrwx  1 nobody users 162 Jul  9 19:24 appdata/
drwxrwxrwx  1 nobody users   0 May 26  2020 domains/
drwxrwxrwx  1 nobody users  26 May 26  2020 system/

