nrgbistro Posted December 22, 2022

I recently replaced my Unraid server's CPU and motherboard, and since the swap I've had many issues with system stability. I think some of my drives are failing, but I can't tell which ones, as the SMART reports all say that the drives are fine. I'm currently running extended SMART tests on each drive, but this may take anywhere from a few hours to days.

I've also had data loss in some of my applications, such as Sonarr and Radarr, but others seem unaffected. Additionally, when I connect to the server via SFTP and try to download a file, it says "unable to download X bytes, retrying..." I've also gotten "Bad Parameter" and "Server Error" when trying to start or remove Docker containers. Finally, when I tried to remove old Docker containers through the CLI, I got an error that essentially said "read-only filesystem".

I've attached my diagnostics zip file. Let me know if any additional information is needed. Any support is greatly appreciated, as I've been trying to solve these problems for a few weeks now!

Attachment: nrgserver-diagnostics-20221222-1442.zip
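For anyone who hits the same "read-only filesystem" error, one quick way to see which mounts the kernel currently treats as read-only is to scan /proc/mounts. The following is only a minimal sketch: the function name `read_only_mounts` is made up for illustration, and it assumes the standard Linux /proc/mounts layout (device, mount point, filesystem type, options). Some mounts are read-only by design, so only an unexpected entry is interesting.

```python
def read_only_mounts(mounts_text):
    """Return the mount points whose option list contains 'ro'.

    mounts_text is the content of /proc/mounts, where each line has the
    form: device mount_point fs_type options dump pass
    (hypothetical helper, not part of Unraid or Docker)
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point = fields[1]
        options = fields[3].split(",")
        if "ro" in options:
            result.append(mount_point)
    return result

# Usage on a Linux host (run from the server's terminal):
#   with open("/proc/mounts") as f:
#       print(read_only_mounts(f.read()))
```

A filesystem that was mounted read-write and later shows up as `ro` has usually been remounted by the kernel after it detected errors, which points at a symptom rather than the root cause.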
BRiT Posted December 22, 2022 (Solution)

Since you changed your motherboard/CPU/RAM, have you validated that your system is stable with a few cycles of MemTest?
nrgbistro Posted December 22, 2022 (Author, edited December 22, 2022: add more information)

1 hour ago, BRiT said: Since you adjusted your mb/cpu/ram, have you validated your system is stable enough with a few cycles of MemTest?

How exactly do I run a memtest on Unraid? The RAM currently installed is brand new. At first the system wouldn't boot, but once I reseated each stick it seemed to work normally.

I've also attached the most recent diagnostics zip file with the completed extended SMART test results. If anyone could point me toward which disk seems like a failure risk, that would be super helpful! One of the disks is brand new and I have another new one on the way to replace a potentially failing one.

UPDATE: After restarting the system, my 3rd disk shows the following, even though I had formatted the entire disk (which took 6+ hours):

Unmountable: Unsupported or no file system

Attachment: nrgserver-diagnostics-20221222-1819.zip
BRiT Posted December 23, 2022

2 hours ago, nrgbistro said: How exactly do I run a memtest on unraid? The RAM currently installed is brand new. At first the system wouldn't boot but once I reseated each stick it seemed to work normally.

It's an option in the boot menu, before it starts loading into Unraid.

Incorrect BIOS settings can cause issues even if the physical memory is perfectly fine. You need to make sure the settings are correct and that none of the XMP or memory overclocking options are enabled.
nrgbistro Posted December 23, 2022 (Author, edited December 23, 2022)

I ended up creating a memtest USB drive and running it on my system. My BIOS settings were incorrect for my RAM (the voltage was too low), and I thought resetting the BIOS would fix these errors, but after repeating the memtest on all sticks I immediately saw more errors (2000+ in 30 seconds).

Now I'm running the memtest on each stick individually. So far the first stick has thrown errors, the second stick had 0 errors, and the third is ongoing but so far so good. Does this mean I can rule out CPU, L1, and L2 cache errors? And conversely, can I conclude that the first stick is what was causing the errors, assuming the other 3 end up with none?

I plan on returning the faulty stick to Amazon and getting a new one. Once I've installed it, is there anything else you would recommend I check regarding system stability? Thank you for the suggestions so far, BTW! I never would have thought to check my memory...

UPDATE: The rest of the sticks produced 0 errors individually as well as together. Unraid now seems much more stable and I haven't encountered anything suspicious so far. Will report back if I find any problems in the near future.
nrgbistro Posted December 26, 2022 (Author)

Still having issues with the server, such as Docker containers randomly stopping, server execution errors, and the reverse proxy only partially working. I'm going to return this RAM and get a different brand, and also update my motherboard BIOS. If that doesn't fix the problem, I plan to recover anything important from the server (I currently only have a backup of my Docker appdata folder) and then do a complete reset.
sage2050 Posted June 8, 2023

Did the new RAM solve your issues?
nrgbistro Posted June 8, 2023 (Author)

Indeed it did. Can't believe I got faulty RAM brand new on Amazon, but if anyone else has weird issues with their Unraid server, make sure to check your RAM!!
sage2050 Posted June 8, 2023

I memtested mine and didn't get any errors, but I'm running out of ideas. I got some budget RAM (OLOy) though, so it might just be a case of you get what you pay for.
nrgbistro Posted June 8, 2023 (Author)

9 minutes ago, sage2050 said: I memtested mine and didn't get any errors

How long did you run the memtest for? I originally thought a few hours would be enough and didn't find anything, but eventually I left it running for about 24 hours and found many errors. It is also possible that the RAM is causing issues in some other way. Did you try configuring or resetting the BIOS settings for your RAM?
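If taking the server down for a long memtest is not an option, a rough in-OS sanity check is to hash the same unchanging file several times and compare the results. This is only a sketch and not a substitute for MemTest: the helper names `file_digest` and `repeated_read_digests` are invented for illustration, and getting identical digests does not prove the RAM is good. Different digests for a file nobody modified, however, suggest that something in the read path (RAM, controller, cabling, disk) is unstable.

```python
import hashlib

def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def repeated_read_digests(path, passes=3):
    """Hash the same file several times and return the distinct digests.

    A set with more than one element means two reads of the same
    (unmodified) file disagreed.
    """
    return {file_digest(path) for _ in range(passes)}

# Usage (the path is a placeholder; pick a large file that is not being written to):
#   digests = repeated_read_digests("/path/to/large-file", passes=5)
#   print("consistent" if len(digests) == 1 else "MISMATCH: %s" % digests)
```

Repeated reads are often served from the page cache in RAM rather than from disk, so a mismatch here does not by itself tell you which component is at fault.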
sage2050 Posted June 14, 2023

I ran two passes; I didn't want to keep the server down for too long. The BIOS is configured properly as far as I can tell. To preserve uptime, I'm just going to grab a new set, RMA the old ones, and see if my stability improves.