
GUI blew up in my face :( Now what?



Talk about timing. Right in the middle of a parity check, the GUI decided to go south on me :( The good news is that the parity check is still running in the background. Is there any way to restart the GUI, or should I just wait for the parity check to finish and reboot from the command line?

 

I am on 6.9.2, by the way.
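For what it's worth, I found suggestions that the webGUI can be restarted from a shell with the stock rc script (listing it here in case someone can confirm the path is right for 6.9.x and that it's safe to run mid-parity):

# restart the nginx service that serves the webGUI, without stopping the array
/etc/rc.d/rc.nginx restart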

 

And what could cause this GUI crash? See attached pic.

 

And I get this in syslog:

 

 

root@NAS-UNRAID:/var/log# tail -f syslog
Jul 17 18:51:38 NAS-UNRAID root: error: /webGui/include/DashUpdate.php: uninitialized csrf_token
Jul 17 18:51:38 NAS-UNRAID root: error: /webGui/include/DashUpdate.php: uninitialized csrf_token
Jul 17 18:51:41 NAS-UNRAID root: error: /webGui/include/DashUpdate.php: uninitialized csrf_token
Jul 17 18:51:46 NAS-UNRAID root: error: /webGui/include/DashUpdate.php: uninitialized csrf_token
Jul 17 18:51:46 NAS-UNRAID root: error: /webGui/include/DashUpdate.php: uninitialized csrf_token
Jul 17 18:51:46 NAS-UNRAID root: error: /webGui/include/DashUpdate.php: uninitialized csrf_token
Jul 17 18:51:51 NAS-UNRAID root: error: /webGui/include/DashUpdate.php: uninitialized csrf_token
Jul 17 18:51:53 NAS-UNRAID root: error: /webGui/include/DashUpdate.php: uninitialized csrf_token
Jul 17 18:51:53 NAS-UNRAID root: error: /webGui/include/DashUpdate.php: uninitialized csrf_token
Jul 17 18:51:56 NAS-UNRAID root: error: /webGui/include/DashUpdate.php: uninitialized csrf_token

 

 

[Screenshot attached: GUI-UNRAID.png]


Well, I managed to stop the parity check and do a reboot, but one of the disks now shows as disabled?

 

Should I start my parity check again, and hopefully that will fix the disabled disk?

 

[Screenshot attached: DISABLED.png]

 

root@NAS-UNRAID:~# df -h
Filesystem      Size  Used Avail Use% Mounted on
rootfs           16G  755M   15G   5% /
devtmpfs         16G     0   16G   0% /dev
tmpfs            16G     0   16G   0% /dev/shm
cgroup_root     8.0M     0  8.0M   0% /sys/fs/cgroup
tmpfs           128M  280K  128M   1% /var/log
/dev/sda1        59G  526M   59G   1% /boot
overlay          16G  755M   15G   5% /lib/modules
overlay          16G  755M   15G   5% /lib/firmware
tmpfs           1.0M     0  1.0M   0% /mnt/disks
tmpfs           1.0M     0  1.0M   0% /mnt/remotes
tmpfs           1.0M     0  1.0M   0% /mnt/rootshare
/dev/md1        7.3T  4.4T  3.0T  60% /mnt/disk1
/dev/md2        7.3T  3.2T  4.2T  43% /mnt/disk2
/dev/md3        7.3T  2.6T  4.8T  36% /mnt/disk3
/dev/md4        7.3T   52G  7.3T   1% /mnt/disk4
/dev/md5        7.3T   52G  7.3T   1% /mnt/disk5
/dev/md6        7.3T   52G  7.3T   1% /mnt/disk6
/dev/sdb1       448G   25G  421G   6% /mnt/cache
shfs             44T   11T   34T  24% /mnt/user0
shfs             44T   11T   34T  24% /mnt/user
/dev/loop2       20G  2.8G   17G  14% /var/lib/docker
/dev/loop3      1.0G  3.8M  905M   1% /etc/libvirt


Yes, that was the df -h after the reboot.

 

OK, here is the history of what happened, and I am sure it was related.

 

- I created a new share to prep a 1TB transfer from an external SSD drive. I set it to "Yes: Use cache pool (for new files/directories)".
- Plugged in the SSD drive and started the 1TB transfer.
- It turns out my cache drive is only 500GB, so at some point during the transfer the cache drive got full, I started getting these dynamix errors, and the GUI crashed.
- I did a command-line reboot; it came back and started a parity check. During the parity check the mover was trying to move files from the cache to the new share. It did finish moving whatever was on the cache, about 480GB, so the cache drive is now empty.
- I got the dynamix errors again during the parity check and the GUI was messed up, but I managed to stop it and reboot. When it came back online, disk 2 was disabled.
- I uninstalled all my dynamix plugins for now so my GUI stops crashing.
- Now I have started a parity check again. I hope that after the parity check I can rebuild this disk? The data is all there, I guess emulated?

 

What course should I take? Wait for the parity check to finish, hopefully with no errors, and then do a disk rebuild?

Is it possible my parity got messed up during the unsuccessful transfer from the 1TB SSD drive? 

 

Diagnostics attached.

nas-unraid-diagnostics-20220717-1937.zip

15 minutes ago, johnwhicker said:

Yes, that was the df -h after the reboot.

I needed to see it when the GUI problems were happening; the symptoms might have been due to filling rootfs. If it happens again, try the command again.
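If the GUI starts misbehaving again, a quick way to see whether rootfs is the culprit (just standard Linux tooling, nothing Unraid-specific; the depth and cut-off below are only an example):

# how full is rootfs?
df -h /
# biggest top-level directories on rootfs only (-x stays on this filesystem, so /mnt, /proc etc. are skipped)
du -hx -d1 / 2>/dev/null | sort -h | tail -20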

 

The emulated disk2 is showing more than 3TB of contents, so a rebuild should be OK.

 

Since you have dual parity you might as well let the check complete; hopefully there won't be any sync errors. It isn't using disk2 for the check since that disk is disabled.

 

Jul 17 19:24:01 NAS-UNRAID root: Fix Common Problems: Warning: Share IOT-BKP is set for both included (disk3) and excluded (disk1) disks ** Ignored
Jul 17 19:24:01 NAS-UNRAID root: Fix Common Problems: Warning: Share RAID-1 is set for both included (disk1) and excluded (disk2,disk3,disk4,disk5,disk6) disks ** Ignored
Jul 17 19:24:01 NAS-UNRAID root: Fix Common Problems: Warning: Share TM is set for both included (disk2) and excluded (disk1) disks ** Ignored
Jul 17 19:24:01 NAS-UNRAID root: Fix Common Problems: Warning: Share WIN-BKP is set for both included (disk3) and excluded (disk1) disks ** Ignored

There are no good reasons to set both include and exclude, and there are good reasons not to do that. Your settings aren't even consistent, since include means only those disks, and exclude means all disks except those. So, for example, that first one says only disk3 should be used, and also says all disks except disk1 should be used.
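To illustrate with that first warning (one possible way to resolve it, not the only one): for the IOT-BKP share you would either set Included disks to disk3 and leave Excluded disks blank, or leave Included blank and exclude the disks you don't want; setting both at once is what triggers the warning.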

 

 

 

 


Thanks much. I will clean up the share configs after this.

 

For sure it was filling rootfs, as I saw some rootfs errors on my IPMI console, so that's it.

 

So after the parity check is done, I just rebuild disk 2?

- Stop the array
- Set disk2 to be not installed
- Start the array
- Stop the array
- Set disk2 back to the appropriate disk
- Start the array
- Unraid will now commence the rebuild operation.


Any path that isn't actual mounted storage is in rootfs, which is in RAM and is where the OS lives. If you fill it, the OS can't work with its own files anymore.

 

If you have a docker container with a host path that isn't actual mounted storage, for example, that container could fill rootfs.
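One way to review which host paths your containers are actually using (standard docker CLI, nothing specific to your setup) is something like:

# print the host-side source of every mount for each running container
docker ps --format '{{.Names}}' | while read name; do
  echo "== $name"
  docker inspect -f '{{range .Mounts}}{{.Source}}{{"\n"}}{{end}}' "$name"
done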

 

Those are the correct steps to rebuild to the same disk.


Thank you Sir.

 

I looked at all my docker containers and they all have paths to a /mnt storage point so I should be covered.

 

Now, on the cache issue, how does it work? My share was set to use the cache for new files, but the size of the transferred files (1TB) was bigger than the cache (500GB). So what happens when it hits 500GB? That's where my crash happened. Is it just the way the cache behaved, or could it be something with the "DirSyncPro" app I used?

4 hours ago, johnwhicker said:

Thank you Sir.

 

I looked at all my docker containers and they all have paths to a /mnt storage point so I should be covered.

 

Now, on the cache issue, how does it work? My share was set to use the cache for new files, but the size of the transferred files (1TB) was bigger than the cache (500GB). So what happens when it hits 500GB? That's where my crash happened. Is it just the way the cache behaved, or could it be something with the "DirSyncPro" app I used?

You should have a Minimum Free Space value set for the cache (click on it on the Main tab to get to this setting) to stop it getting completely full. Ideally this value should be something like twice the size of the largest file you expect to write (or larger). When the free space on the cache drops below this value, then for subsequent files Unraid will bypass the cache and start writing directly to the array.
 

If you do not set a Minimum Free Space value then Unraid will keep selecting the cache for new files, and if they will not fit you will get out-of-space type errors.
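As a worked example (numbers made up purely for illustration): if the largest file you ever expect to write is around 50GB, setting the cache's Minimum Free Space to 100GB means that once less than 100GB remains free on the pool, new files for that share start going to the array instead of the cache.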

2 hours ago, johnwhicker said:

paths to a /mnt storage point

There is nothing to prevent something from creating a path in /mnt that isn't mounted storage. For example, I've seen cases where people lost their cache for some reason and had a path specified as /mnt/cache, which winds up in rootfs since the cache isn't mounted.
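A quick sanity check for any such path (standard util-linux/coreutils commands; /mnt/cache is just an example path):

# reports "is a mountpoint" only if the path is real mounted storage
mountpoint /mnt/cache
# or see which filesystem the path actually lives on (rootfs means trouble)
df -h /mnt/cache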

11 hours ago, itimpi said:

You should have a Minimum Free Space value set for the cache (click on it on the Main tab to get to this setting) to stop it getting completely full. Ideally this value should be something like twice the size of the largest file you expect to write (or larger). When the free space on the cache drops below this value, then for subsequent files Unraid will bypass the cache and start writing directly to the array.
 

If you do not set a Minimum Free Space value then Unraid will keep selecting the cache for new files, and if they will not fit you will get out-of-space type errors.

 

Well, when I go to Main, under the cache settings, and look at the help, I get this, which conflicts with your statement?

 

Minimum free space: "This defines a "floor" for the amount of free space remaining in the volume. If the free space becomes less than this value, then new files written via user shares will fail with "not enough space" error." There is no mention of bypassing the cache and writing directly to the array?

 

Am I looking at the right settings?

 

[Screenshot attached: Screen Shot 2022-07-18 at 9.05.19 PM.png]

 

4 hours ago, trurl said:

That is one of the features of cache:yes and cache:prefer user shares. For those user shares, if the specified pool (cache) has less than minimum, new files will overflow to the array.

I guess that help text could do with an update to make this clear. I think it is mentioned on the Share Settings page but not here.

 

Personally I would like the default for pools/cache to not be 0 so new users are less likely to get bitten by this issue, but that is a different conversation.

3 minutes ago, itimpi said:

I guess that help text could do with an update to make this clear.

That's explained in the share configuration, where you have the different options like cache:yes, since that's where it matters. As far as I understand, setting a minimum free space on the cache drive itself will not cause an overflow for shares that aren't configured as Cache:Yes.

1 minute ago, Kilrah said:

That's explained in the share configuration, where you have the different options like cache:yes, since that's where it matters. As far as I understand, setting a minimum free space on the cache drive itself will not cause an overflow for shares that aren't configured as Cache:Yes.

I agree, but there is no reason that help text cannot mention overflow to the array with the correct share settings (if just for consistency).

4 minutes ago, Kilrah said:

But would the drive's Minimum Free Space setting trigger an overflow on those that do, even if the share's minimum free space is not configured or is configured lower?

Yes. The share-level setting is designed to handle the case of array drives getting full, not a pool/cache getting full.

 

In the past there was a time when the higher of the share-level setting and the pool setting was (incorrectly, I believe) applied to the pool/cache, but I think this is no longer the case.


Let me see if I got this right. On my share, before transferring the 1TB, my setting was Use cache: Yes for that new share.

 

But when it hit the 500GB maximum of my cache, everything crashed and my GUI blew up. I definitely saw some errors in my IPMI console about the cache being full or something like that, so it did not spill over to the array.

 

Is that because I had nothing set under Main -> Pool Device -> Cache -> Minimum free space (it was 0)?

 

I have a backup Unraid system and will do a test, because this entire experience hosed my system and that ain't good :)

 

I really appreciate the support and guidance

17 minutes ago, johnwhicker said:

Is that because I had nothing set under Main -> Pool Device -> Cache -> Minimum free space (it was 0)?

That could definitely cause the cache to fill up, and btrfs especially is prone to corruption when you fill it.

 

Shouldn't cause a crash, though.
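If you want to check whether the btrfs cache picked up any damage, the usual tooling (standard btrfs-progs commands run against the mounted pool, not steps prescribed earlier in this thread) looks like:

# show per-device error counters (write/read/flush/corruption/generation)
btrfs device stats /mnt/cache
# run a scrub and wait for it to finish (-B stays in the foreground and prints a summary)
btrfs scrub start -B /mnt/cache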

1 hour ago, trurl said:

That could definitely cause the cache to fill up, and btrfs especially is prone to corruption when you fill it.

 

Shouldn't cause a crash, though.

 

Makes sense. Currently I have appdata, domains and system on the cache for performance reasons; I can put them back on the array. Should I reformat the cache or repair it? In case something got corrupted, I prefer to fix it now before data starts moving through the cache again. What do you advise?

 

 

root@NAS-UNRAID:/mnt/cache# ls -al
total 16
drwxrwxrwx  1 nobody users  40 Jul 19 12:39 ./
drwxr-xr-x 14 root   root  280 Jul 19 12:29 ../
drwxrwxrwx  1 nobody users 162 Jul  9 19:24 appdata/
drwxrwxrwx  1 nobody users   0 May 26  2020 domains/
drwxrwxrwx  1 nobody users  26 May 26  2020 system/

