CameraRick Posted November 29, 2021

Hi there,

I'm sorry I didn't find a better title.

I'm still in the trial period of unRAID; it's a safe purchase for me (just waiting for some components). But I also mention this to emphasize that I'm a new user, not only with unRAID but with Linux stuff in general.

So, here's the situation. I was just coming home and my unRAID was terribly loud: all fans (controlled by the mainboard) were blasting at full speed. I was not able to reach it over the WebUI on the network, and my router showed it as disconnected. However, the Ethernet port (normal Gigabit connection from the mainboard) was still blinking. No image from the HDMI (onboard graphics), but I also didn't start in GUI mode. I tried using the power switch to shut it down safely through ACPI: no dice, no reaction. So I pushed and held the power button till the power went out.

I left my apartment ~3h before and everything was fine, so it should have happened between 19:30 and 22:30.

After rebooting the machine the sound was back to normal. Inside the WebUI I saw the CPU was sitting at 60°C (but you know how much peak heat can vanish in a short amount of time). All HDDs were at good temperatures, and as far as I can tell all shares still work (no data loss). A parity check is now running (and will likely take 10h+).

I only run a few dockers, like a Plex server or jDownloader2, and a few others that don't run by default. Yesterday I installed my first and only VM (macOS Catalina through Macinabox), but I think I switched that one off in the evening.

So, I panicked a bit. Now I'm asking myself if and how I could find out what happened? I opened the system log (not that I understand any of it), but it just starts at 22:30 when I booted the machine again :o

Best,
Rick
trurl Posted November 29, 2021

Attach diagnostics to your NEXT post in this thread.

2 minutes ago, CameraRick said:
system log (not that I understand any of it), but it just starts at 22:30h when I booted the machine again

The syslog is in RAM like the rest of the OS, so it is lost on every reboot. You have to set up the syslog server to get it saved somewhere.
CameraRick Posted November 29, 2021 Author (edited)

Hi there trurl,

thanks for your blazing fast reply! I attached the diagnostics and set the syslog to mirror to flash.

Best,
Rick

Attachment: lunas-diagnostics-20211129-2329.zip

Edited November 29, 2021 by CameraRick
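With the syslog mirrored to flash, a future incident leaves a file that can be searched for the time span in question (here roughly 19:30 to 22:30). The following is only a minimal sketch, assuming each line starts with a traditional syslog timestamp such as `Nov 29 21:14:03`; the function name `lines_in_window` and the sample lines are made up for illustration and are not real log output from this system.

```python
from datetime import datetime, time


def lines_in_window(lines, start, end, month="Nov", day=29):
    """Return the lines stamped on the given day between start and end (inclusive)."""
    selected = []
    for line in lines:
        # Expected layout (an assumption): "Mon DD HH:MM:SS rest of message"
        parts = line.split(None, 3)
        if len(parts) < 3:
            continue
        try:
            stamp = datetime.strptime(parts[2], "%H:%M:%S").time()
        except ValueError:
            continue  # line does not carry a timestamp in the expected place
        if parts[0] == month and parts[1] == str(day) and start <= stamp <= end:
            selected.append(line)
    return selected


# Placeholder lines, invented purely to exercise the function
sample = [
    "Nov 29 19:05:00 host placeholder: before the window",
    "Nov 29 20:15:30 host placeholder: inside the window",
    "Nov 29 22:29:59 host placeholder: also inside",
    "Nov 29 22:31:00 host placeholder: after the reboot",
    "not a syslog line",
]

hits = lines_in_window(sample, time(19, 30), time(22, 30))
for hit in hits:
    print(hit)
```

To use it on a real file, the lines would be read from wherever the mirrored syslog ends up; that location depends on the setup and is not given in this thread.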
trurl Posted November 29, 2021

Nothing obvious in diagnostics. Have you done memtest?
CameraRick Posted November 29, 2021 Author

Hi trurl,

I ran prime95 a week ago to test stability a bit, but I haven't run memtest at all so far. I saw it's available from the boot screen. I assume I should run it to see if the RAM is okay? Makes me wonder how that could send the CPU to go "all in", haha

Best,
Rick
trurl Posted November 30, 2021

2 minutes ago, CameraRick said:
see if the RAM is okay? Makes me wonder how that could send the CPU to go "all in"

Everything goes through RAM, including the executable code of the OS. Maybe it is fine, but you don't want to run any computer with bad RAM.
CameraRick Posted November 30, 2021 Author

Hi trurl,

ah, I see, makes sense! I think my new mainboard will arrive tomorrow. The machine unRAID is currently running on sat in a corner for some years, yet the mainboard is the only part that bothers me. So aside from bad RAM (it's 32GB in four sticks), I assume it could also be the mainboard, or that it needs a cleaning. So I guess I will move the hardware once the board arrives and do the memtest right then, hopefully everything tomorrow evening.

Best,
Rick
CameraRick Posted November 30, 2021 Author

The new mainboard didn't come in, so I ran Memtest on the old one. It shows some errors: in the beginning it ran without any, then a low number, then around 20, and the next time I looked it was 99. So I assume that is bad? Now I have to check every stick separately to narrow down what's wrong, right?

Best,
Rick
JonathanM Posted November 30, 2021

Any errors are bad. To complicate matters, memtest only catches obvious memory errors, so a clean memtest doesn't prove the memory is good, but errors always mean something is wrong.
CameraRick Posted November 30, 2021 Author

Hi there Jonathan,

so what would you suggest I do next?

Best,
Rick
trurl Posted November 30, 2021

23 hours ago, CameraRick said:
32GB in four sticks

Test each stick separately. If they all pass, test each slot separately.
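The order suggested above (sticks first, slots only if every stick passes) can be written down as a small decision helper. This is just a sketch of that reasoning, not a tool that runs memtest; the function name `next_step` and the stick labels are hypothetical, and the error counts have to be entered by hand after each solo run.

```python
def next_step(stick_errors):
    """Suggest the next memtest run.

    stick_errors maps a stick label to the error count from testing that
    stick alone, or None if it has not been tested yet.
    """
    untested = [stick for stick, errors in stick_errors.items() if errors is None]
    if untested:
        return f"test stick {untested[0]} alone"

    # Any error count above zero marks the stick as suspect.
    suspect = [stick for stick, errors in stick_errors.items() if errors > 0]
    if suspect:
        return "suspect sticks: " + ", ".join(suspect)

    # A clean run does not prove a stick is good, so move on to the slots.
    return "all sticks passed; test each slot separately"


# Hypothetical labels and counts, for illustration only
print(next_step({"A": 0, "B": 0, "C": None, "D": None}))
print(next_step({"A": 0, "B": 0, "C": 1, "D": 10}))
print(next_step({"A": 0, "B": 0, "C": 0, "D": 0}))
```

Keeping one stick per run is what makes the result attributable: with all four installed, an error count says nothing about which stick or slot caused it.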
CameraRick Posted December 1, 2021 Author

Hi trurl,

thanks again! The new mainboard arrived and I have already checked two of the four DIMMs: three passes each, no errors so far. I hooked up the third and will go to bed now. Hopefully I can check all four by tomorrow evening. If every single one is fine, I will test all four together again on this new mainboard.

Best,
Rick
CameraRick Posted December 2, 2021 Author

Hi guys,

the one I checked overnight had a single error after some passes. Then I put in the fourth one to check while I'm at work; that one had ten errors before I put my pants on. I took both of them out and will only use the two seemingly okay DIMMs. For my use case, 16GB should be enough anyway.

Unfortunately I can't set up unRAID properly with the new mainboard yet, because I have no Ethernet at my bench and the signal the GUI sends out seems to be incompatible with my 7" screen, so I will have to check everything else tomorrow after work.

Thanks again for all your help

Best,
Rick