Memory Log 100% Full after making new boot USB drive

MikaelTarquin · April 24, 2022

A few days ago I had a power outage. My UPS allowed the server to shut down gracefully, but I was unable to bring it back up the next day. It turns out the USB drive was bad. I didn't have a flash backup, so I made a new USB drive and used the registration tool to reclaim my license. Everything seemed to go very smoothly at first.

Today I was notified that my Ombi page isn't working, and sure enough I can't log in either. While looking for possible causes, I noticed on my dashboard that my log is using 100% of its memory. I am unsure what is causing this, so I attached the diagnostics here. Would anyone be able to help me figure out why this is happening? Thank you!

nnc-diagnostics-20220424-0943.zip

Squid · April 24, 2022

Looks like the root problem here is that the file system on your cache drive is corrupted. This is often caused by bad memory. Run a memtest.

(Side note, if you have no plans to upgrade to a multiple device pool, then usually you're better off using XFS as it's more forgiving on systems that are not 100% rock stable)

MikaelTarquin · April 24, 2022

OK, is the best way to run memtest to start it from the boot menu and let it run for a few days?

I replaced the cache drive very recently. Plex and others seem to be working. How should I best handle the corrupted file system?

EDIT: I saw this post from a few years back saying it's pointless to run memtest with ECC RAM. Is that true? My RAM is ECC (Dell PowerEdge T630).

https://forums.unraid.net/topic/91204-how-to-run-memtest-headless/?do=findComment&comment=846406

Edited April 24, 2022 by MikaelTarquin

itimpi · April 24, 2022

I think the version of memtest you can download from memtest86.com can handle ECC RAM.

MikaelTarquin · April 25, 2022

Thanks! Is there a recommended way to run that on my unRAID server, or do I just need to run it from its own boot device?

3 hours ago, itimpi said:

I think the version of memtest you can download from memtest86.com can handle ECC RAM.

JonathanM · April 25, 2022

10 minutes ago, MikaelTarquin said:

run that on its own boot device

This.

JorgeB · April 25, 2022

Apr 23 05:08:42 NNC kernel: macvlan_broadcast+0x10e/0x13c [macvlan]
Apr 23 05:08:42 NNC kernel: macvlan_process_broadcast+0xf8/0x143 [macvlan]

If the server sometimes crashes, see below.

Macvlan call traces are usually the result of having Docker containers with a custom IP address. Upgrading to v6.10 and switching to ipvlan might fix it (Settings -> Docker Settings -> Docker custom network type -> ipvlan; advanced view must be enabled, top right), or see below for more info.

https://forums.unraid.net/topic/70529-650-call-traces-when-assigning-ip-address-to-docker-containers/

See also here:

https://forums.unraid.net/bug-reports/stable-releases/690691-kernel-panic-due-to-netfilter-nf_nat_setup_info-docker-static-ip-macvlan-r1356/
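
The call-trace lines quoted above can be picked out of a saved syslog with a short script. The following is only a minimal Python sketch: the function name `find_macvlan_frames` and the third sample line are hypothetical, and the pattern assumes the kernel tags these frames with `[macvlan]` as in the two lines quoted in this post.

```python
import re

# Matches kernel log lines whose stack frame belongs to the macvlan module,
# e.g. "... kernel: macvlan_broadcast+0x10e/0x13c [macvlan]".
MACVLAN_FRAME = re.compile(r"kernel: .*\[macvlan\]")


def find_macvlan_frames(lines):
    """Return the log lines that contain a macvlan stack frame."""
    return [line.rstrip("\n") for line in lines if MACVLAN_FRAME.search(line)]


sample = [
    "Apr 23 05:08:42 NNC kernel: macvlan_broadcast+0x10e/0x13c [macvlan]",
    "Apr 23 05:08:42 NNC kernel: macvlan_process_broadcast+0xf8/0x143 [macvlan]",
    # Hypothetical unrelated line, added only to show that it is skipped.
    "Apr 23 05:08:43 NNC kernel: example unrelated message",
]

for hit in find_macvlan_frames(sample):
    print(hit)
```

To scan a real log, pass an open file object instead of `sample`; the path to the syslog inside the diagnostics archive is not given in this thread.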

MikaelTarquin · June 14, 2022

I still need to run memtest and update Unraid to v6.10, but I have been busy with a move and unable to find the time. However, today I noticed my cache drive is throwing a SMART error again (Reallocated Sector Count). This exact thing happened almost exactly 1 year ago, and I was unable to solve the problem then short of buying a new SSD. Needless to say, seeing an expensive 2TB SSD throw SMART errors after only 1 year and ~30TB of writes is extremely upsetting.

In case it's related: during the move, I also discovered I was unable to boot my server (a Dell T630) until I moved a stick of RAM out of slot B1 (currently slots A1, A2, B2, and B3 are populated). Swapping other DIMMs didn't resolve the error. Only when that slot was unpopulated did it get to BIOS.

Am I just screwed?

nnc-diagnostics-20220613-1909.zip nnc-smart-20220613-1918.zip

JorgeB · June 14, 2022

The SMART test is failing, so the drive should be replaced.

MikaelTarquin · June 14, 2022

But I JUST replaced it 13 months ago. This drive has 9000 hours of use, and only 30TB of writes. I don't want to just blindly replace expensive SSDs every 12 months if something in UNRAID or my system is killing them.

JorgeB · June 14, 2022

It can't be Unraid killing the SSD. It could be if there was an unusually high amount of writes, but that's not the case.
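
For a sense of scale, the figures quoted earlier in the thread (about 9000 power-on hours and roughly 30TB written) can be converted into an average daily write rate. This is a minimal Python sketch that assumes decimal units (1 TB = 1000 GB) and a constant write rate, neither of which is stated in the thread; the function name is hypothetical.

```python
def average_daily_writes_gb(total_writes_tb: float, power_on_hours: float) -> float:
    """Average writes per day in GB, assuming 1 TB = 1000 GB."""
    days = power_on_hours / 24  # power-on hours converted to days
    return total_writes_tb * 1000 / days


# Figures taken from the posts above: ~30 TB written over ~9000 hours.
rate = average_daily_writes_gb(30, 9000)
print(f"{rate:.0f} GB/day")  # prints "80 GB/day"
```

Comparing that rate with the drive's rated endurance from its datasheet (not given in this thread) would show whether write volume is a plausible cause.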
