p3rky2005 Posted July 10, 2022

Hi everyone,

I'm properly at a loss here. I built my Unraid box when Boris locked us down, following SpaceInvader One's guides, which are absolutely brilliant. I'm a tech guy anyway so I love this, but Linux, containers, Docker and the like are quite new to me. I've worked with VMs before, but I'm a Windows guy.

With that said, let's get down to the problem and how it started. This machine ran perfectly for around 2 years with essentially zero downtime. I've been running Swag, Nextcloud, MariaDB, Plex, Emby, Sonarr, Radarr, SabNZB, Shinobi, Youtube DL and probably a few more I can't remember now, as well as a Windows 10 VM.

The issues I'm having are: within the VM, Chrome shows the "Aw, Snap!" page stating a memory access violation, and my Docker containers crash. For example, Emby will usually stay logged in until it crashes, then it asks me for a username and password and will not accept my details until I restart the container. Having said that, I get code 403 when trying to restart the container, and this is only resolved by a system reboot.

The system is a Ryzen 3600X with 24GB RAM, 4x 6TB HDDs (1 of them for parity) and 1x 2TB WD Blue NVMe SSD (it houses my containers/appdata and VM/domains folders), which I was also using as a cache drive. I did wonder if this was an SSD failure, so the 2TB NVMe is new; I was running a 1TB before.

Anyway, here's when the issue started. As stated above, I've been following SpaceInvader One's guides (what a bloke for these guides). I installed Tdarr and started to transcode my sizable Plex library into H.265. All was going well and I managed to convert about half, if not 2/3, of my library. Then I added a GPU to speed things up, as I had just been letting the CPU chug through it when I wasn't at the helm of the machine. This is where the problem started: I filled up my cache drive and thus crashed my machine. I got back, rebooted, realised what had happened, thought "oh, silly me", ran the mover and thought all was good.

My machine has never been stable since. I've now removed the GPU (for reference it was a GTX1080 3gb), and I'm still using the RTX 3070 I had in for the Windows VM. I've attached the diagnostics for you to take a look through. I have 2 unhappy-ish drives, but they were already thumbs down when I started the transcode. 1 of them has simply hit its lifetime timer, and I can't see these giving me all this trouble when the stuff I'm running mainly sits on the SSD.

Any help would be greatly appreciated, and if I've forgotten anything, shout and I'll add details as we need them. I'd love to get this stable again as it's so useful. Thank you everyone in advance, can't wait to see what you all think.

voyager-diagnostics-20220710-1249.zip
Squid Posted July 10, 2022

Assuming that you've already handled this: https://forums.unraid.net/topic/46802-faq-for-unraid-v6/page/2/#comment-819173

You definitely want to run a memtest from the Boot Menu (if you're booting via UEFI, you will need to temporarily switch to Legacy mode for Memtest to work, or alternatively create a new boot stick via https://www.memtest86.com/).

Additionally, if you're going to mix and match memory (which you are), the most stable system is one which:
- Has all matching sticks
- Has matching sticks installed in pairs (this also has a massive performance boost)
- If unable to match sticks, then you must ensure that the CL timing is identical between the sticks
MAM59 Posted July 10, 2022

Make sure you run a very intensive memtest (build your own stick; the UNRAID version is outdated, not UEFI, and does not recognize modern Ryzen CPUs).

Many Ryzen boards have memory problems. Even if you have bought "3200 approved" memory, this does not mean that you will also be able to use this speed. The more slots are occupied, the slower the speed has to be. Pushing the voltage above 1.35V is dangerous too; it will kill your memory within a few years (wearing it out).

I, for instance, get "only" 3000 at 1.35V out of my "3200 certified" 128GB of sticks. It took me a week of testing to arrive at this value (start with 3200, cancel if an error shows up, go down to 2400, all well, raise to 2800 and so on...). One run can take the whole night, depending on the amount of memory and the number of cores of your processor.

But stay cool! Do it until you have an absolutely rock-solid combination. Otherwise random errors like those you have now will drive you crazy. ANYTHING may happen with bad RAM...
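The trial-and-error MAM59 describes (start high, drop on errors, raise again when a run is clean) is roughly a bisection over the available memory speeds. Below is a minimal Python sketch of that idea, not a tool from the thread. The names `highest_stable_speed` and `is_stable` are hypothetical, and `is_stable` merely stands in for "run an overnight memtest at this speed and report whether it was error-free", which you would do by hand. It also assumes that if a speed passes, every lower speed passes too.

```python
from typing import Callable, Optional, Sequence


def highest_stable_speed(
    speeds: Sequence[int],
    is_stable: Callable[[int], bool],
) -> Optional[int]:
    """Return the fastest speed that passes, or None if none do.

    `speeds` must be sorted ascending. `is_stable` is a placeholder for a
    full memtest run at that speed (hypothetical, done manually in practice).
    Assumes stability is monotonic: a passing speed implies all slower
    speeds pass as well.
    """
    low, high = 0, len(speeds) - 1
    best: Optional[int] = None
    while low <= high:
        middle = (low + high) // 2
        if is_stable(speeds[middle]):
            best = speeds[middle]   # clean run: remember it, try faster
            low = middle + 1
        else:
            high = middle - 1       # errors: only slower speeds remain
    return best


if __name__ == "__main__":
    candidate_speeds = [2400, 2666, 2800, 3000, 3200]
    # Pretend result of the overnight tests: anything up to 3000 is clean.
    print(highest_stable_speed(candidate_speeds, lambda speed: speed <= 3000))
```

With five candidate speeds this needs at most three test runs instead of five, which matters when each run takes a night.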
p3rky2005 Posted July 12, 2022 Author

Hi guys, and thank you so much for your replies.

Due to family issues I've only just been able to run memtest on my server, and I'm ashamed to say (as I used to be a computer engineer) that it looks very much like it was memory. Squid, thanks so much. It looks like both of my Corsair Vengeance sticks have died. The first stick instantly threw errors, hundreds of them. So I popped that out, used the same slot and ran the second stick, which only threw 1 or 2 errors, but that's still dead in my mind.

Now I'm down to 1x 8GB ADATA stick, which is not ideal when I need to run a VM, but at least I know the issue now.

Can't thank you enough. I may come back to you shortly with some other questions, as I now need to repair my Plex Docker container database, which keeps telling me it's corrupt. And is there a way within Unraid to scan for corrupted data?

Thank you all so much.
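On the question of scanning for corrupted data: one generic approach, independent of Unraid, is to record a checksum for every file and compare against that record later. The Python sketch below illustrates the idea only. The function names `hash_file`, `build_manifest` and `changed_files` are made up for this example, and it is not an Unraid feature. It can only flag files whose contents changed after the first manifest was taken, so it cannot tell you what was already damaged while the bad RAM was installed, and files you edited on purpose will show up as changed too.

```python
import hashlib
import os
from typing import Dict, List


def hash_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: str) -> Dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    manifest: Dict[str, str] = {}
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            full_path = os.path.join(folder, name)
            manifest[os.path.relpath(full_path, root)] = hash_file(full_path)
    return manifest


def changed_files(old: Dict[str, str], new: Dict[str, str]) -> List[str]:
    """List files present in both manifests whose checksum differs."""
    return sorted(path for path in old if path in new and old[path] != new[path])
```

A typical use would be to call `build_manifest` on a share once the memory is known to be good, store the result (for example as JSON), and then rebuild and compare with `changed_files` after a crash or on a schedule.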