jsspanjer Posted August 30, 2023
Hi. Unraid has been crashing since 6.12.3. I switched back to 6.12.2, but the crashes remain.
What I did thus far:
I followed the advice here to look at my Docker network configuration and switched from macvlan to ipvlan.
I also disabled "Host access to custom networks".
I set the advanced Plex configuration to skip the health check.
I looked at my go file to see if there were any IPv6 settings. None exist.
I switched on my syslog server to dump the syslog on the cache. I see no relevant "crash" notifications.
The only thing I see, in my "telegram bot", is that parity starts and after a while it is "cancelled". The server reboots and we are in a loop.
For now I have disabled Docker and VMs and cancelled the parity check. I attached my logs.
Attachments: syslog.zip, media-01-diagnostics-20230830-1703.zip
jsspanjer Posted August 30, 2023
Last logs on the flash drive.
Attachments: media-01-diagnostics-20230828-2031.zip, media-01-diagnostics-20230829-1644.zip
jsspanjer Posted August 31, 2023
After cancelling parity and stopping Docker and VMs, the server has an 18-hour uptime as of now. I will keep Docker and VMs offline. I will start a parity check now and see what happens.
jsspanjer Posted August 31, 2023 (edited)
After 3 hours of parity check the server reboots and we are in a parity check loop again. 😑 I have stopped the parity check again. I don't know what to do...
jsspanjer Posted August 31, 2023
One thing to mention: I am not a new user. I work in IT and have used Unraid since the "Limetech old days". My post count is low because I never used the forum to post; I have always used it as a reference. The only reason for posting is that I need a second pair of eyes and I really do not want corrupt data. Not everything is backed up.
JorgeB Posted August 31, 2023
29 minutes ago, jsspanjer said: After 3 hours parity check the server reboots and we are in a parity check loop again. 😑
This suggests a hardware issue, start by running memtest and/or using a different PSU.
jsspanjer Posted August 31, 2023 (edited)
1 hour ago, JorgeB said: This suggests a hardware issue, start by running memtest and/or using a different PSU.
But it is a strange coincidence that it started after installing 6.12.3. Rollback to 6.12.2 did not solve it though. I get that a parity check puts more load on the server. There are more services running on the machine, and they still work if I just cancel parity. I only stopped them for convenience.
But I will do a memtest. A PSU is not something I have lying around. I could order one, but that would be a shame if it turned out not to be related to the issue. Is there a way to see if this is PSU related?
Just checked. The PSU is from 2019: Corsair RM550x 550 Watt 80 PLUS Gold Fully Modular ATX PSU (10 Year Warranty).
JorgeB Posted August 31, 2023
1 hour ago, jsspanjer said: Rollback to 6.12.2 did not solve it though.
If it was the upgrade, downgrading should have fixed it.
jsspanjer Posted August 31, 2023
2 hours ago, JorgeB said: If it was the upgrade, downgrading should have fixed it.
Yes. Agreed. I am doing a memtest at the moment.
jsspanjer Posted September 4, 2023
So. I did a 10-pass memory test and tested my PSU with a PSU tester. Nothing wrong with them; everything tests fine. I also upgraded to 6.12.4 with the recommended configuration for Docker. Nothing solved. Still a crash after I start Docker and VMs. I have custom Docker networks on, with eth0 and a VLAN configuration. I also reformatted my cache drives to ZFS.
JonathanM Posted September 4, 2023
4 hours ago, jsspanjer said: test my PSU with a PSU Tester.
That PSU tester (I have the exact same one) only tests at idle current; it does not put any load on the supply. As with memtest, only a failure is conclusive. A pass from either memtest or that style of PSU tester just means it passed under certain conditions, not all conditions.
jsspanjer Posted September 5, 2023 (edited)
10 hours ago, JonathanM said: That PSU tester (I have the exact same one) only tests at idle current, it does not put any load on the supply. Similar to the memtest, it only can confirm a failure with a negative result, a pass by either memtest or that style of PSU tester just means it passed under certain conditions, not all conditions.
Hi. I get that. But I have traversed this forum for years and have never asked a question here. Every time someone posts a problem here, there's always someone who asks for syslog and diagnostics. I only see the syslog downloaded once. So it is not looked at.
I myself am in the IT troubleshooting business. I do Windows; Unraid is a hobby. When something like this happens, I always look at what was changed. I had an uptime of hundreds of hours before I updated to 6.12.3. Immediately after the reboot the crashes started.
Mind you, it still could be hardware related. But it happened directly after installing 6.12.3, and after that the crashes stayed. Downgrading to 6.12.2 did not fix it. Upgrading to 6.12.4 did not fix it. I ran memtest for a couple of days (as asked) and tested my PSU (as asked). But I still see 1 download of the syslog and 0 downloads of the diagnostics files.
I am not ruling out hardware, but I do not know Unraid well enough to see if there's anything wrong in the syslog / diagnostics. Could someone please rule them out before I buy a new PSU? Or would anyone be kind enough to show me (a manual on) how to read them?
SP67 Posted September 5, 2023
You could try pushing power consumption to the max to see if the PSU fails. For example, prime95 can be installed from nerd pack. If you have a GPU, there might be a similar tool for it.
jsspanjer Posted September 5, 2023
15 minutes ago, SP67 said: You could try pushing power consumption to the max to see if PSU fails. For example, prime95 can be installed from nerd pack. If you have a GPU, there might be a similar tool for it.
Good idea. Thank you.
jsspanjer Posted September 5, 2023
If I enter my components into https://outervision.com/power-supply-calculator I get a Load Wattage of 543 W. So that leaves almost no headroom on my current PSU for that kind of max power. I currently have a Corsair RM550x 550 Watt 80 PLUS Gold Fully Modular ATX PSU. Hmmm.
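The comparison in the post above is simple arithmetic, and a short sketch makes the margin explicit. This is only an illustration: the function name `psu_headroom` is made up here, and the 543 W figure is the calculator's estimate from the post, not a measured draw.

```python
def psu_headroom(load_watts: float, psu_watts: float) -> tuple[float, float]:
    """Return (spare watts, load as a fraction of the PSU rating)."""
    if psu_watts <= 0:
        raise ValueError("psu_watts must be positive")
    return psu_watts - load_watts, load_watts / psu_watts


# Figures from the post: estimated load 543 W, PSU rated at 550 W.
spare, utilisation = psu_headroom(load_watts=543, psu_watts=550)
print(f"spare: {spare:.0f} W, utilisation: {utilisation:.1%}")
```

A calculator estimate is a worst-case figure, so a wall-socket power meter reading during a parity check would be a better input to the same calculation if one is available.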
itimpi Posted September 5, 2023
6 hours ago, jsspanjer said: I have never asked a question here. Every time someone posts a problem here, there's always someone who asks for syslog and diagnostics. I only see the syslog downloaded once. So it is not looked at.
Not quite sure what point you are trying to make? The diagnostics include the current syslog from RAM. The only time I would expect to see a need to post a syslog separately is when it is one resulting from the syslog server.
SP67 Posted September 5, 2023
9 hours ago, jsspanjer said: If I insert my Power calculations into https://outervision.com/power-supply-calculator I get a Load Wattage: 543 W So. That means that my current PSU is not capable of running that kind of max power. I currently have a Corsair RM550x 550 Watt 80 PLUS Gold Fully Modular ATX PSU. Hmmm.
Take a look at your motherboard settings to find a way to limit the TDP of the CPU to 65W (disable one CCX and enable eco mode, for example, or look online). It will reduce performance but decrease power consumption by quite a bit, so you can stay below your PSU's max wattage. If you have a Kill A Watt or similar, you can see what the system is actually pulling and whether you exceed 500W.
jsspanjer Posted September 6, 2023
17 hours ago, itimpi said: Not quite sure what point you are trying to make?
If I had not provided the logs, everybody would want me to provide them. Now I have provided the logs and nobody seems to do anything with them. There was only one download.
jsspanjer Posted September 6, 2023
14 hours ago, SP67 said: Take a look at your motherboard settings to find a way to limit the TDP of the CPU to 65W (disable one ccx and enable eco mode for example or look online). It will reduce performance but decrease power consumption by quite a bit, so you can stay below your PSU max wattage. If you have a kill a watt or similar you can see what the system is actually pulling and see if you exceed 500W.
I have ordered a new power supply. This time I bought the Corsair RM1200 Shift. That's mostly overkill but will make sure that there's enough power. Hopefully that will solve the issue.
jsspanjer Posted September 8, 2023
Installed the new power supply, an RM1200 (1200W). Came to realize that the "old" power supply was 850W. It was the RM850i, so it should have been sufficient.
Booted the server up. The parity check started again on Sep 7 at 19:20. Looked this morning and the server had rebooted again at around 4:20. I am not seeing anything strange in the syslog. Could someone please assist in looking at it? The latest image is from my telegram-bot notification.
One thing to mention: Disk 1 is normally not used and is seen overheating when a parity check runs. It reaches 51 degrees and drops again to 49 degrees. I have many more disks of the same kind. They are close together, but only this one is 3 degrees warmer than the rest. Normally I do not use this disk; it was empty. I started doing Time Machine backups to it a couple of weeks ago. This morning, after I saw that the server still reboots, I disabled that and removed all files on it.
Attachments: syslog.zip, media-01-diagnostics-20230904-0922.zip
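For anyone wanting to pin down when a reboot happened in a syslog that was mirrored to persistent storage, one rough approach is to look for the kernel's first boot message and report the entry just before it. The sketch below assumes traditional syslog timestamps ("Sep  8 04:19:47") and assumes the boot banner contains the text "Linux version"; both may need adjusting. The sample lines are made up for illustration and are not taken from the attached logs.

```python
from datetime import datetime

# Assumption: the first kernel message of a fresh boot contains this text.
BOOT_MARKER = "Linux version"


def parse_stamp(line: str, year: int) -> datetime:
    """Parse the leading 'Mon DD HH:MM:SS' of a syslog line; the year is supplied by the caller."""
    month, day, clock = line.split()[:3]
    return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")


def find_reboots(lines, year):
    """Return (timestamp of last entry before a boot, timestamp of the boot banner) pairs."""
    reboots = []
    previous = None
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        if BOOT_MARKER in line and previous is not None:
            reboots.append((parse_stamp(previous, year), parse_stamp(line, year)))
        previous = line
    return reboots


# Hypothetical sample lines, not real log content.
sample = [
    "Sep  7 19:20:01 media-01 example: parity check started (made-up line)",
    "Sep  8 04:19:47 media-01 example: last entry before the log stops (made-up line)",
    "Sep  8 04:21:30 media-01 kernel: Linux version (made-up, truncated line)",
]
for last_seen, booted in find_reboots(sample, year=2023):
    print(f"log stops at {last_seen}, next boot logged at {booted}")
```

If the last entries before the boot banner are ordinary messages with no error, panic, or shutdown sequence, that is consistent with an abrupt reset rather than an orderly restart, although it does not identify the cause.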
JorgeB Posted September 8, 2023
There's nothing relevant logged. That, and the fact that the server rebooted on its own, suggests a hardware problem. If it's not the PSU, it can be another component, like RAM, board, etc.
jsspanjer Posted September 8, 2023
4 minutes ago, JorgeB said: There's nothing relevant logged, that and that fact that the server rebooted on its own suggests a hardware problem, if it's not the PSU it can be another component, like RAM, board, etc.
Thank you for looking at the log. Will let you know if I find some other "culprit".
JorgeB Posted September 8, 2023
Since you have 4 RAM sticks, I would start by running with just two (in dual channel mode). If the result is the same, try the other two. That would basically rule out RAM issues, or an unstable board with all four sticks in use.
itimpi Posted September 8, 2023
It would also be worth checking that all fans are spinning OK, as the reboot could be due to the CPU overheating and causing a thermal shutdown.
jsspanjer Posted September 10, 2023
So. I have an extra USB drive (backup), which I erased. I installed a fresh copy of Unraid 6.12.4 on it. The only config I copied over is the minimum needed to do a parity sync and see if it stays up.
From the latest flash backup:
/config/pools
/config/shares
/config/disk.cfg
/config/share.cfg
super.dat
I also copied my second pro.key to it.
Parity succeeded. Uptime is 19 hours and counting. As far as I can see, this is not a hardware issue. The original "master" USB is an upgraded Unraid installation from 2019. I will slowly configure this new USB to see if the server continues to function.
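The selective restore described in the post above can be sketched as a small script that copies only the listed items from a flash backup onto a freshly prepared drive and reports anything it could not find. This is a hedged illustration: the function name, the demo folder names, and the placeholder files are invented here, and placing super.dat under config/ is an assumption, since the post lists it without a path. The license key is left out and would be handled separately.

```python
import shutil
import tempfile
from pathlib import Path

# Items named in the post. Putting super.dat under config/ is an assumption.
MINIMAL_ITEMS = [
    "config/pools",
    "config/shares",
    "config/disk.cfg",
    "config/share.cfg",
    "config/super.dat",
]


def copy_minimal_config(backup_root: Path, flash_root: Path, items=MINIMAL_ITEMS) -> list[str]:
    """Copy the listed files and folders; return the items missing from the backup."""
    missing = []
    for rel in items:
        src = backup_root / rel
        dst = flash_root / rel
        if src.is_dir():
            shutil.copytree(src, dst, dirs_exist_ok=True)  # needs Python 3.8+
        elif src.is_file():
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
        else:
            missing.append(rel)
    return missing


# Demo in a temporary directory with placeholder files (hypothetical layout).
with tempfile.TemporaryDirectory() as tmp:
    backup = Path(tmp) / "flash-backup"
    new_flash = Path(tmp) / "new-flash"
    (backup / "config" / "shares").mkdir(parents=True)
    (backup / "config" / "shares" / "example.cfg").write_text("placeholder\n")
    (backup / "config" / "disk.cfg").write_text("placeholder\n")
    new_flash.mkdir()
    missing = copy_minimal_config(backup, new_flash)
    copied_ok = (new_flash / "config" / "shares" / "example.cfg").exists()
    print("not found in backup:", missing)
```

Copying an explicit allow-list, rather than the whole config folder, is what makes the test meaningful: plugins, Docker settings, and network configuration from the old drive stay behind, so they can be reintroduced one at a time.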