Report Comments posted by DieFalse

  1. 4 minutes ago, ThatDude said:

    I'm not sure if this is a bug, a quirk, or a coincidence. After upgrading from RC1 (which ran fine), one of my drives has been marked 'unmountable: unsupported partition layout'. But it's still in its slot (Disk 1 - see attached) and is not being emulated, which is what I would have expected. Also, the array isn't showing as degraded - is that normal?

    Screenshot 2021-11-02 at 23.38.45.jpg

    Definitely start a support thread and post your diagnostics in it.

    • Like 1
  2. 1 hour ago, limetech said:

    Could be.  Nothing in 'stock' Unraid OS requires libgd so we would not have noticed if an updated package removed it.

    We can add it - what about 'vnstat' package?  Is this useful to add to Unraid OS?

     

    vnstat is needed for Network Statistics to function correctly, and it currently will not start on my servers: "vnstat service must be running STARTED to view network stats." Please keep it.

     

    "root@Arcanine:~# vnstat
    Error: Database "/var/lib/vnstat//vnstat.db" contains 0 bytes and isn't a valid database, exiting."

     

    root@Arcanine:~# vnstat
    Error: Failed to open database "/var/lib/vnstat//vnstat.db" in read-only mode.
    The vnStat daemon should have created the database when started.
    Check that it is configured and running. See also "man vnstatd".

     

    root@Arcanine:~# vnstatd -d
    Error: Not enough free diskspace available in "/var/lib/vnstat/", exiting.
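
    For anyone hitting the same errors, a rough recovery sketch, assuming the zero-byte database and the free-space check are the only problems (paths as in the errors above):

      df -h /var/lib/vnstat          # confirm the filesystem actually has free space
      rm /var/lib/vnstat/vnstat.db   # remove the empty, invalid database
      vnstatd -d                     # restart the daemon so it recreates the database
      vnstat                         # verify stats are now being collected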

  3. 18 hours ago, danioj said:

     

    I have to openly admit that I do not have the technical insights as to how it works. What I can share is what I experienced.

     

    The server was stable ever since I issued the command initially. I had to shut off the server as I had an electrician in to install some smart light switches and we had to cut the power.

     

    When I turned back on - well, 5 hours after I turned back on - the server crashed.

     

    I hard reset, went back in, and issued the same command as above, and it's been stable again ever since. I concluded from that (and it will be interesting to see if I get a crash again over the coming week, noting they previously came daily) that the command has to keep being issued.

     

    Thank you for detailing this.  I am looking for extra information on how this solves the issue to begin with and how it behaves for others.  Hopefully someone more knowledgeable than I am can chime in and expand.

  4. 2 hours ago, danioj said:


    I have just tested this. The command does not survive reboots. 

     

    Without re-issuing the command, after a reboot, have you had the call trace?  The reason I ask is that the number will change, as designed, so verifying it with cat /proc/sys/net/netfilter/nf_conntrack_max or similar is not valid.
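
    For reference, a quick way to watch the live connection count against the configured ceiling (both are standard netfilter proc files):

      watch -n 5 'cat /proc/sys/net/netfilter/nf_conntrack_count /proc/sys/net/netfilter/nf_conntrack_max'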

  5. 14 hours ago, danioj said:

     

    Update since I applied this "Fix" - no call traces in the log or hard crashes yet.

     

    If this works I will probably need to add the command to a user script to execute on array start.

     

    You should not need to. It is a once-and-done command UNLESS something in the system sets the conntrack value too high, which shouldn't happen and hasn't since kernel 5.12.2: https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.12.2

     

    Since the netfilter conntrack fix is working as of now, the NIC limitation was a false flag. I don't know what NICs you are using, but some consumer-level ones do not like multiple MAC addresses and fail with as few as two.  Creating a virtual VLAN, a br0 IP, etc. creates new virtual MAC addresses, and the card can only handle so many. Example of an enterprise card: "Many Mellanox adapters are capable of exposing up to 127 virtual instances." Consumer card: "Realtek 1Gb NICs are often limited to 6-12 virtual instances."

     

    Looking at your config, you have, I believe, 7 instances on one card and 1 on the other, which is controlled by an integrated Intel® i210AT dual-port 1Gb controller limited to 5 vectors (instances) per port, so it is technically over the limit. HOWEVER, you're not assigning IPs to the VLANs, and I believe this stops them from being true virtual instances, so theoretically it should be fine.

     

    Give it some more time, but please advise if you experience any call traces, and if so, post diagnostics along with the syslog.  I don't expect you to have any, though.
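
    Should the setting turn out not to survive a reboot (as reported above), a minimal sketch of reapplying it at boot, assuming the stock Unraid /boot/config/go file:

      # appended to /boot/config/go -- runs once at boot, before the webgui starts
      sysctl net/netfilter/nf_conntrack_max=131072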

  6. Ok - awaiting your reply on testing.  I am 100% seeing netfilter as the call trace cause.  Your experience with IPVLAN is expected, and you can overcome this by properly building your Docker network with custom networks and some other config (rough sketch below).  Having "host access" enabled will cause issues and is generally not advised; it would take you some time to correct your config so it doesn't need it. But that's a different thread.
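
    A rough sketch of such a custom network; the subnet, gateway, parent interface, and container are placeholders for your own values:

      docker network create -d macvlan \
        --subnet=192.168.1.0/24 --gateway=192.168.1.1 \
        -o parent=br0 customnet
      docker run -d --network=customnet --ip=192.168.1.50 nginx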

  7. On 8/28/2021 at 9:15 PM, danioj said:

    Update. I had another hard lock up over night. Same issues in the log.

     

    Tried the fix linked to me by @ljm42 above:

     

    sysctl net/netfilter/nf_conntrack_max=131072

     

    Let's see how it goes.

    I reviewed your logs, and you are experiencing call traces due to your networking adapter; it appears you are somehow reaching the limit of your NIC.   If the fix above doesn't work, splitting the load between a couple of NICs may, or even upgrading your existing NIC or its firmware.   I didn't review the logs enough to find the system's hardware as it's late, so I will take another look tomorrow and let you know if I can see anything deeper.
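
    As a side note, one way to check which driver and firmware the NIC is currently running before considering an upgrade (eth0 is a placeholder interface name):

      ethtool -i eth0   # prints driver, version and firmware-version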

  8. On Aug 20th, I posted the above because debugging seemed to point to nVidia; however, after troubleshooting in depth, it was determined that netfilter was causing the call traces.

     

    I was asked to try "ipvlan" instead of "macvlan" - this made no change, so I reverted back to macvlan.

     

    :: Placeholder for details, outlining the issue, original values, etc. :: :: at work so limited on what I can pull, will edit to add later ::

     

    I have since, after reviewing other similar call traces, found references to setting the conntrack max in an effort to resolve them.  Just over 36 hours ago I made the following change in the terminal: "sysctl net/netfilter/nf_conntrack_max=131072", and verified it with "cat /proc/sys/net/netfilter/nf_conntrack_max", which showed the new value of 131072.  I have not had a single call trace since.

     

    TLDR: setting this 

    sysctl net/netfilter/nf_conntrack_max=131072

    stopped my call traces.

     

    If anyone knows how to help me gather what's needed to see why this stopped the call traces and prevent them from happening to others - please assist.
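
    For anyone wanting to watch for a recurrence, a simple check against the syslog (path is the Unraid default):

      grep -i "call trace" /var/log/syslog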

    • Like 1
  9. Tracking this down with Discord assistance, it was determined that the server was blocking/routing incorrectly due to jumbo frames on the NICs.  When the MTU on the NICs is higher than 1500, no web access across the network is possible - yes, jumbo frames are correctly configured on the switches and router, and jumbo frames worked on <6.9.2.  When the MTU is set to 1500, the WebUI loads correctly.
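
    For reference, checking and temporarily resetting the MTU from the console (eth0 is a placeholder; the permanent value lives in Unraid's network settings):

      ip link show eth0            # the current MTU is shown on the first line
      ip link set eth0 mtu 1500    # temporary, lasts until the next reboot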

    • Like 2
  10. I continue to have problems with GSA and Arcanine.  Arcanine became completely unresponsive last night.  GSA is SSH'able but nothing will run correctly.

     

    Neither is usable in its current state, so I will be forced to revert to 6.9.2 soon to bring them back online.  I would like to give all the information I can to help resolve this. Please let me know what to provide. I can even provide remote access if you want to PM me.

  11. 17 hours ago, fmp4m said:

    I will restore via the instructions this evening.   The btrfs issues were resolved in another thread, which led to me trying 6.10 because of the call traces on previous versions - leading me to here.   I'll advise when both are 100% stock and then upgrade to 6.10rc1 again and see if any difference is noticed.

     

    Ok, Arcanine is on 6.10rc1 with no issues as of this morning.

     

    GSA I could not get into in any way other than SSH, as previously mentioned, so it still has the issues originally reported - except it now returns error 500 on loading the WebUI instead of 302.

     

    I used "use_ssl no" and got into the WebUI; call traces are still happening heavily.

     

    The kernel files should all be stock now.

     

    Since I had previously downgraded, I manually copied 6.9.2 to the root of the flash, rebooted, then upgraded to 6.10rc1 through the UI and verified the bz files matched the downloaded zip for 6.10rc1.
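
    One rough way to do that comparison, assuming the release zip has been extracted to /tmp/unraid (paths are placeholders):

      md5sum /boot/bz* /tmp/unraid/bz*   # the hashes should match pairwise for each bz file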

    gsa-diagnostics-20210822-1057.zip

  12. 15 minutes ago, ljm42 said:

    gsa has the same plugin installed, so I'd remove that and restore it to stock as well.

     

    gsa has a bunch of btrfs mentions in the log, not sure if that is an issue?

    Aug 18 21:45:17 GSA kernel: BTRFS info (device sdb1): relocating block group 43333096308736 flags data
    Aug 18 21:45:22 GSA kernel: BTRFS info (device sdb1): found 9 extents, stage: move data extents
    Aug 18 21:45:22 GSA kernel: BTRFS info (device sdb1): found 9 extents, stage: update data pointers
    Aug 18 21:45:23 GSA kernel: BTRFS info (device sdb1): relocating block group 43332022566912 flags data
    Aug 18 21:45:28 GSA kernel: BTRFS info (device sdb1): found 8 extents, stage: move data extents
    Aug 18 21:45:28 GSA kernel: BTRFS info (device sdb1): found 8 extents, stage: update data pointers
    Aug 18 21:45:29 GSA kernel: BTRFS info (device sdb1): relocating block group 43330948825088 flags data
    Aug 18 21:45:33 GSA kernel: BTRFS info (device sdb1): found 9 extents, stage: move data extents
    Aug 18 21:45:34 GSA kernel: BTRFS info (device sdb1): found 9 extents, stage: update data pointers
    Aug 18 21:45:35 GSA kernel: BTRFS info (device sdb1): relocating block group 43329875083264 flags data
    Aug 18 21:45:40 GSA kernel: BTRFS info (device sdb1): found 8 extents, stage: move data extents
    Aug 18 21:45:41 GSA kernel: BTRFS info (device sdb1): found 8 extents, stage: update data pointers

     

    I will restore via the instructions this evening.   The btrfs issues were resolved in another thread, which led to me trying 6.10 because of the call traces on previous versions - leading me to here.   I'll advise when both are 100% stock and then upgrade to 6.10rc1 again and see if any difference is noticed.
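
    For what it's worth, "relocating block group" messages like those above are normally emitted while a btrfs balance is running; one way to check whether one is in progress (mount point is a placeholder):

      btrfs balance status /mnt/cache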

    • Like 1
  13. 1 minute ago, ljm42 said:

     

    arcanine has the Unraid-Kernel-Helper.plg plugin installed, and the bzfirmware file is 20MB whereas the stock 6.10.0-rc1 file is 10MB.

     

    Does that plugin have an option to return to stock? If so run that and then uninstall the plugin.

     

    Unfortunately, nothing else is really standing out to me. Hopefully somebody else will see something.

     

    That plugin has been removed; the behaviour exists on both GSA and Arcanine.  There was no "revert" option, so I assume that on Arcanine, if it's the only one with a bad bz file, I will need to do something to revert?

     

    I thought when you run an upgrade it installs the latest bz files, so that makes me think something would have to modify them post-upgrade?
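
    A minimal sketch of a manual restore to stock, assuming the 6.10.0-rc1 release zip has been downloaded and extracted to /tmp/unraid (paths are placeholders; the flash is mounted at /boot):

      cp /tmp/unraid/bz* /boot/   # overwrite the modified bz files with stock copies
      reboot                      # boot into the stock kernel and firmware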

  14. 34 minutes ago, ljm42 said:

    > curl localhost results in 302 error, so I am assuming this has to do with the dns redirect that unraid uses to the hash url?  (is there anyway to disable that since I use my own dns and hostnames with valid certs anyway and it would be easy to configure).

     

    ssh into the server and type:

      use_ssl no

    This is the equivalent of going to Settings -> Management Access and setting Use SSL to No.  You will then be able to access the webgui using:

      http://ipaddress  (note: http not https)

     

    Thank you - I will try that to see if I can get into the WebUI, but the hangs / inability to do anything even in SSH are worrisome.  Anything useful in the diagnostics?

  15. Hi JonP, sorry for the delay.   I checked and no, I have over 2TB available.  The cache was balanced and scrubbed.   I have no idea why it can't be recreated.   There may be something wrong with my setup - I've posted a couple of issues that got worked through and a couple that went untouched in General; however, I have had time-outs occur when updating Dockers, and Dockers disappearing because of it, as well as really sluggish WebGUI responses in general.   Can't seem to nail it down at this point, and generic posts tend to yield no results.   Maybe it's related to my system only?