[6.12.10] Cannot fork: Resource temporarily unavailable


    Hendrik112
    • Urgent

Hi guys, my Unraid server had been running 24/7 for a few months when Docker services suddenly started to fail and the Docker daemon became unavailable. I tried logging in to the WebUI, but it was completely frozen. Sadly, I wasn't able to copy diagnostics, since SSH was also unavailable.

Using IPMI, I managed to get the following screenshot while initiating an ACPI shutdown. The server was completely unresponsive for a few minutes until it finally shut down.
Upon restart, a parity check was automatically triggered, which indicates to me that the “soft shutdown” wasn't performed correctly.

The same thing happened again a day later: the same messages over IPMI, and the WebUI barely usable, indicating "no flash drive". I'm not sure whether this was just a frontend bug or the flash drive actually having issues.

     

I have been using Unraid for 5 years now and have never seen anything like this before.

The diagnostics file was generated after the restart was performed.

     

I searched around for the error message but didn't find anything related to my case. My system has 128GB of ECC RAM, so it's unlikely to be an out-of-memory situation like on a 2GB system.

     

    unraid.png

    anton-diagnostics-20240609-0925.zip

     

    Edit:

     

These errors are likely caused by flash drive issues, beginning with this log message:

    emhttpd: Unregistered Flash device error (ENOFLASH7)

     

    Followed by this:

    emhttpd: Plus key detected, GUID: 0781-5581-0000-100124105314 FILE: /boot/config/Plus.key
    emhttpd: error: device_read_smart, 9567: Cannot allocate memory (12): device_spinup: stream did not open: nvme1n1

     

After multiple of these flash drive errors, followed by more memory allocation errors, the whole system becomes increasingly unresponsive, with Docker containers unable to allocate resources and growing more and more unstable.

This results in the original error message:

    Cannot fork: Resource temporarily unavailable

At this point, the Unraid WebGUI is barely responsive. This stage can continue for about 1-3 hours, during which I am still able to gracefully shut down the server using the IPMI KVM and the powerdown command.

Uptime Kuma is the only service still able to push out a Discord notification that a ping process has failed at this stage.

If I miss this window, the server locks up completely, so that even an ACPI shutdown doesn't get through. The only way to shut it down at that point is by turning off the PSU.
     





    Recommended Comments



    Jun 11 21:55:38 Anton emhttpd: Unregistered Flash device error (ENOFLASH7)
    Jun 11 21:55:40 Anton emhttpd: Plus key detected, FILE: /boot/config/Plus.key
    Jun 11 21:56:05 Anton emhttpd: Unregistered Flash device error (ENOFLASH7)
    Jun 11 21:56:06 Anton emhttpd: Plus key detected, FILE: /boot/config/Plus.key
    Jun 11 21:56:08 Anton emhttpd: Unregistered Flash device error (ENOFLASH7)
    Jun 11 21:56:10 Anton emhttpd: Plus key detected, FILE: /boot/config/Plus.key

     

You are having issues with the flash drive. Try using a different USB port, ideally USB 2.0; if the issue remains, try a different flash drive.


This board only has USB 3.2 Gen1 ports. I have been using that port for about 5 months now, and the flash drive has been running Unraid for 5 years.

     

I will try replacing the flash drive and hope that someday we will be able to properly install Unraid.


    After replacing the flash drive, it took about 1.5 days until the server froze up completely again.

    This is the syslog from the whole time period:

     

I will bump this to Urgent now, as this keeps happening and is rendering my server completely unusable.

FYI, I have not performed any BIOS updates or anything else that could have caused this. The only thing I did was add two more HDDs about a week ago.

    syslog-192.168.1.10 (1).log


The flash drive is still disappearing, but there are no associated USB errors, so that's strange; it could be a config issue or a hardware issue. First, I would redo the flash drive and restore just the bare minimum: the key, super.dat, and the pools folder for the assignments; also copy the docker user templates folder. If all works, you can then reconfigure the server, or restore a few config files at a time from the backup to see if you can find the culprit. If that doesn't help, look in the board BIOS for any USB-related settings and try toggling those.
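A rough sketch of that minimal restore, assuming a freshly created flash plus a backup of the old one; the BACKUP path is hypothetical and must be adjusted:

# Hypothetical backup location -- point this at your own flash backup
BACKUP=/mnt/user/backups/flash
FLASH=/boot   # mount point of the Unraid flash drive

cp "$BACKUP"/config/*.key     "$FLASH/config/"   # license key
cp "$BACKUP"/config/super.dat "$FLASH/config/"   # disk assignments
cp -r "$BACKUP"/config/pools  "$FLASH/config/"   # pool assignments
# docker user templates (standard dockerMan plugin path)
mkdir -p "$FLASH/config/plugins/dockerMan"
cp -r "$BACKUP"/config/plugins/dockerMan/templates-user "$FLASH/config/plugins/dockerMan/"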


Bad news. I did the following:
- Fresh install using the Unraid USB Creator 2.1, copying over only the following configuration:
    - Plus.key
    - super.dat
    - pools dir
    - docker user template (only Portainer)
    - docker.cfg (using a custom macvlan docker network)
    - network.cfg (using VLANs)
- Ran only the following compose stacks:
    - Traefik without Uptime Kuma
    - Authelia
    - Jellyfin without the nvidia runtime
    - Nextcloud
- Set disk spin-down to 3h
- Configured shares to use the caches
- Enabled local syslog

After an uptime of about 2 days and 21 hours, the system displayed the same error. This time I was able to run the diagnostics command from the CLI, as shown below. After rebooting, there wasn't even a logs folder on the flash drive.
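For reference, this is just the stock Unraid CLI command; the resulting zip should land in the logs folder on the flash drive:

# Collect diagnostics from the shell when the WebUI is unresponsive;
# the zip is written to /boot/logs/ on the flash drive
diagnostics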

I will check the UEFI settings next, but I hadn't changed anything before this problem occurred, and I would be surprised if there were suddenly an issue with the CPU, RAM, or motherboard; that seems really unlikely to me. One thing that could potentially be causing this is the PSU: it's good but old, and I am not sure about the fan.

    syslog-192.168.1.10 (2).log

    Jun 17 23:15:35 Anton emhttpd: Unregistered Flash device error (ENOFLASH7)
    Jun 17 23:15:35 Anton emhttpd: read SMART /dev/sdh
    Jun 17 23:15:35 Anton emhttpd: error: device_read_smart, 9552: Cannot allocate memory (12): device_spinup: stream did not open: sdh

     

There is also a Cannot allocate memory error logged at the same time. Depending on the Unraid logging backend, which I'm really not familiar with, the order of events might not be accurate. I'm restoring all the other configs now and running memtest for a day or two. I also checked the motherboard's USB settings; I have no idea what they do, everything is still set to Auto, and there are no descriptions provided.


    I was able to get the diagnostics saved this time around.

     

Switched to a brand-new Seasonic power supply and basically rewired the whole system.

The RAM has no issues; that was unlikely anyway, since it's original Samsung ECC RAM, and I tested it with the latest memtest version. I guess I will now just go backwards through the Unraid versions and see if that changes something.

    I remembered updating the OS before adding two more disks.

    anton-diagnostics-20240620-0646.zip

    1 hour ago, Hendrik112 said:

I will now just go backwards through the Unraid versions and see if that changes something.

Worth a try, and I also don't think this is RAM related.


    Where can I download Unraid 6.12.8? That's the last version I used (skipped 6.12.9).

It's not showing in the USB Creator tool, and it's also not available in the download archive.

     

The new documentation is completely useless, btw. The headline states "Manual upgrade or downgrade", but the subsections don't mention downgrading at all. Link
Links to the docs from other forum posts don't work, or they point to a random article, including yours from your first response.


    The manual option can be used to upgrade and downgrade.

     

    v6.12.8 attached.

     


    Sadly, downgrading to 6.12.8 didn't help either.

     

Since 6.12.8 was running fine for months, I don't think it makes sense to downgrade further.

I am now looking into USB signaling issues. I switched the flash drive to the front panel ports (also USB 3.0), which are connected to the X470 chipset instead of directly to the CPU.

     

Is there any way support can reactivate an already-used flash drive, in case I have to try another one?

     


     

    43 minutes ago, Hendrik112 said:

Is there any way support can reactivate an already-used flash drive, in case I have to try another one?

Never heard of this being possible, but the only way to be certain would be to contact support.

    16 hours ago, Hendrik112 said:

Since 6.12.8 was running fine for months, I don't think it makes sense to downgrade further.

I agree, and it suggests the OS is not the problem. You could try recreating the flash drive from stock and restoring only the bare minimum: the key, super.dat, and the pools folder for the assignments; also copy the docker user templates folder. If you still get the flash drive related errors in the log after that, it basically confirms it's a hardware issue.

    1 minute ago, JorgeB said:

I agree, and it suggests the OS is not the problem. You could try recreating the flash drive from stock and restoring only the bare minimum: the key, super.dat, and the pools folder for the assignments; also copy the docker user templates folder. If you still get the flash drive related errors in the log after that, it basically confirms it's a hardware issue.

    You already suggested that, and I already did that like a week ago. Unless I am missing something?


I searched the Unraid forums as well as the official Unraid Discord for the ENOFLASH7 error and categorized the results.

In conclusion, most posts did not contain a solution.

The slowly failing system is consistent across the posts.

The issue seems to be present in all Unraid versions from 6.11 to 6.12.

Nobody was able to point out what exactly caused the issues.

One user replaced their USB flash drive, seemingly without ever resolving the issue, while others just recreated their flash drive and were fine.

     

Fixed by disabling C-States on Ryzen CPUs (will try next):
    - https://forums.unraid.net/topic/100870-solved-multiple-errors-unregistered-flash-device-error-and-fork-errors/

     

Fixed by stopping Shinobi / stopping a failing docker container (checking):
- https://forums.unraid.net/topic/122636-solved-server-freeze-every-morning-requires-hard-reboot-fork-resource-unavailable/
- https://forums.unraid.net/topic/138316-help-solve-repeatedly-failing-unraid-server/

     

    Probably fixed by recreating the flash drive:
    - https://forums.unraid.net/topic/132557-got-on-this-am-to-a-mostly-hung-server/

- https://forums.unraid.net/topic/134416-unregistered-flash-device-error-enoflash7/

     

    No response from author:
    - https://forums.unraid.net/topic/146773-6124-cannot-allocate-memory-error-system-half-hanging/
    - https://forums.unraid.net/topic/116184-enoflash7-everyday/
    - https://forums.unraid.net/topic/125302-enoflash7-in-random-intervals/
    - https://forums.unraid.net/topic/141213-unraid-612-vm-crashing-enoflash7/
    - https://forums.unraid.net/topic/135679-periodic-freeze-until-reboot-unregistered-key-detected-unregestered/
    - https://forums.unraid.net/topic/118573-display-”no-flash“-for-the-first-ten-minutes/

    1 hour ago, Hendrik112 said:

    You already suggested that, and I already did that like a week ago. Unless I am missing something?

You restored some more stuff; this time it is just to be sure it isn't, for example, a container causing issues. Restore the minimum and run it without any containers for a couple of days, but I really doubt that is the problem.


    Sure, that's worth a try.

Now that I am thinking about it, it's actually not that unlikely, since docker.cfg sets the Docker storage type to "directory", which will use the Docker zfs storage driver if the share is on a ZFS pool.

The Docker zfs storage driver could very well be causing these issues.
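A quick way to confirm which storage driver the daemon is actually running with, using the standard Docker CLI:

# Print the storage driver the running Docker daemon selected
# (e.g. overlay2, btrfs, or zfs)
docker info --format '{{.Driver}}'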

     

    The system should fail again tomorrow anyway, so I can try that.


    Hehe, I figured it out :)

     

    This is not a hardware issue!

     

Unraid OS is vulnerable to Docker fork bombs. From my limited understanding of how the Linux kernel and Docker work, this should usually be prevented by the cgroups PIDs limit.
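As a stopgap, Docker can enforce exactly such a limit per container; a minimal sketch, with 64 as an arbitrary example limit:

# Fork attempts beyond the limit fail inside the container
# ("Resource temporarily unavailable") instead of exhausting the host's
# global PID space; compose files support the same via pids_limit.
docker run --rm --pids-limit 64 alpine sh -c \
  'for i in $(seq 1 100); do sleep 30 & done; wait'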

Given the number of issues related to this problem, and people throwing away perfectly fine flash drives, I see a strong case for investigating this further and fixing it in the OS. One guy even replaced his mainboard just because of this. The container mentioned below is not the first to slowly fork bomb a system, nor will it be the last.

     

How I was finally able to find the root cause:

After roughly 30 hours of uptime, I ran the docker stats command, which revealed that the Authelia container was using 3318 PIDs.

The number of PIDs was also steadily increasing, once every 10 seconds. That turned out to be the health check interval, which I had modified from 30s to 10s.
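For anyone who wants to run the same check, the PIDS column of the stock docker stats output is what gave it away:

# One-shot snapshot of each container's PID count
docker stats --no-stream --format 'table {{.Name}}\t{{.PIDs}}'

# Poll every 10 seconds to see whether a container's count only ever grows
watch -n 10 "docker stats --no-stream --format 'table {{.Name}}\t{{.PIDs}}'"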

The Authelia Docker container, if it has been configured with TLS certs, causes the Docker health check script, which uses the Alpine wget command, to create zombie processes. After checking with James, one of the lead devs of Authelia, whether this could cause system crashes, I decreased the health check interval to 0.1s. This greatly sped up the process and caused my system to freeze after around 1.5h instead of days. This behavior is obviously not intended by Authelia and will soon be fixed.
I was expecting the system to crash roughly at the PID limit, but it actually crashed at about half that (presumably because thread IDs are allocated from the same pool, so counting processes alone undercounts the real usage).

     

    cat /proc/sys/kernel/pid_max
    32768

     

    ps -e | wc -l
    bash: fork: retry: Resource temporarily unavailable
    bash: fork: retry: Resource temporarily unavailable
    16157
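The zombie accumulation itself is visible with a plain ps, nothing Unraid-specific assumed:

# List zombie (defunct) processes; the PPID column shows which parent
# is failing to reap its children
ps -eo stat,ppid,pid,comm | awk '$1 ~ /^Z/'

# Count them; a steadily growing number matches the pattern above
ps -eo stat= | grep -c '^Z'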

Glad that you've found the issue, but just to confirm: were the logged flash drive issues also related to that?

    11 minutes ago, JorgeB said:

Glad that you've found the issue, but just to confirm: were the logged flash drive issues also related to that?

    Oh yes of course!

A system running out of PIDs will fail unpredictably. ENOFLASH7 and the memory allocation errors all stem from that.
With a real fork bomb, like the classic one in bash, the system halts almost instantly.

Because this was happening very slowly, the system was able to keep killing processes and reassigning PIDs until there weren't any left.

The kernel probably killed whatever process runs the USB and license check. I'm not sure whether the actual USB drivers would be killed for the kernel to survive, and I really don't want to find out.

    Never ever fork bomb a system you still want to use. There is a high chance you will have massive problems down the line.
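For reference, the classic bash fork bomb mentioned above, shown only so it can be recognized; do not run it on anything you care about:

# DO NOT RUN: defines a function ':' that pipes into itself and backgrounds
# the result, doubling the process count until pid_max is exhausted.
:(){ :|:& };: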

    50 minutes ago, Hendrik112 said:

    Oh yes of course!

Thanks for confirming. I did find it strange that there weren't any associated USB errors.






