April 12, 2025 Link to previous issue where my cache would not mount so I wiped and restarted: After a few days of running with no issues, the logs show that around April 10th/11th I started getting a bunch of faults/error codes that are indecipherable to me. This is the same problem I was having before the issue I linked above, and it eventually caused my cache to become unmountable. Typically I restart my server, fix any data errors in the cache, and then keep going. It happens every couple of days. The two logs I attached are the syslogs downloaded after I rebooted the server (`megaplex` prefix) and the syslogs that I back up, which show what happened prior to the reboot (no prefix). I've also attached diagnostics. Please let me know if there's anything else I can upload. Not sure what's going on!
syslog-192.168.0.10.log megaplex-syslog-20250412-1608.zip megaplex-diagnostics-20250412-1014.zip
April 13, 2025 Community Expert Solution Also recommend running memtest, since ZFS is detecting data corruption.
April 13, 2025 Author Thank you both for the support & recommendations. I ran memtest first since that was the easiest thing to try...lots of errors. Results are tens of thousands of errors over multiple tests in all 32GB of RAM. Does this just mean I need new sticks (along with the Ryzen system recommendations)? And if so, are there any I should lean towards or away from given my current system specs? My understanding is that I would not be able to go with ECC RAM given my Ryzen CPU.
edit: Didn't realize memtest is still running and runs multiple passes. At ~400k errors so far, will update one more time once it's done.
Edited April 13, 2025 by ryantomlinson95
April 14, 2025 Community Expert
4 hours ago, ryantomlinson95 said: ran memtest first since that was the easiest thing to try...lots of errors ... At ~400k errors so far, will update one more time once its done
No point in running it more just to get more errors. Even 1 error is too many. EVERYTHING goes through RAM. The OS and other executable code, YOUR DATA.
16 hours ago, JorgeB said: zfs is detecting data corruption
The CPU can't do anything with anything until it is loaded into RAM. You shouldn't even attempt to run any computer unless memory is working perfectly.
April 14, 2025 Author OK, got it, makes sense! I guess I'm wondering if buying new sticks is the next step I should take? Sounds like I also need to get all my BIOS settings correct so I don't cause more issues with the new sticks.
April 14, 2025 Author Yeah, I just checked my BIOS settings and they all look correct according to the FAQ post you linked, namely Power Supply Idle Control -> Typical Current Idle, and memory frequency at DDR4-3200MHz.
April 14, 2025 Author Thanks for that! One of my two sticks boots and passes memtest with no errors. With the other one I can't get past the BIOS, so I assume it's bad in some way, even though it's recognized in the BIOS. Should I try running my server for a few days with just the single stick of RAM, or should I try to replace it first?
April 14, 2025 Community Expert
41 minutes ago, ryantomlinson95 said: Should I try running my server for a few days with just the single stick of RAM
You can do that; it's unlikely that both are bad. You will need to scrub the pool and look for any corrupt files, which should be deleted or restored from a backup, then keep monitoring the pool for new corruption: https://forums.unraid.net/topic/46802-faq-for-unraid-v6/page/2/#findComment-700582
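For anyone following along, the "scrub, then look for corrupt files" step can be partly scripted. The following is only a rough sketch, not anything from the linked FAQ: it assumes the pool is named `cache` (substitute your own pool name), that the `zpool` command is on the PATH, and that `zpool status -v` lists damaged files under its usual "errors: Permanent errors have been detected in the following files:" line. The helper names `corrupt_files` and `pool_status` are made up for illustration.

```python
import subprocess


def corrupt_files(status_text: str) -> list[str]:
    """Return the paths listed under 'Permanent errors' in `zpool status -v` output.

    Returns an empty list when the errors line reports no known data errors.
    """
    paths = []
    in_errors = False
    for line in status_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("errors:"):
            # The file list only follows the "Permanent errors" variant.
            in_errors = "Permanent errors" in stripped
            continue
        if in_errors and stripped:
            paths.append(stripped)
    return paths


def pool_status(pool: str) -> str:
    """Run `zpool status -v <pool>` and return its text output."""
    result = subprocess.run(
        ["zpool", "status", "-v", pool],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Example use on the server (pool name "cache" is an assumption):
#   subprocess.run(["zpool", "scrub", "cache"], check=True)  # start the scrub
#   ...wait for the scrub to finish, then:
#   for path in corrupt_files(pool_status("cache")):
#       print("restore or delete:", path)
```

An empty list after a completed scrub corresponds to the "No known data errors" state; anything listed should be deleted or restored from backup, as described above.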
April 15, 2025 Author Thanks! Up and running now with a single stick of RAM and a second one on the way. Pools scrubbed with no data errors, all my containers recreated from backups, and I'm monitoring for new corruption. Appreciate all the help 🙌