Gumdomike Posted April 20, 2020

My server is rebooting frequently, especially when I try to write data to the array. I was able to capture a video of a crash the other day and posted it here. Any guidance would be greatly appreciated. The server is pretty much unusable since it crashes whenever I try to write data to the array. I thought maybe it was a heat issue with the processor so I just reapplied new thermal paste prior to capturing the video. That obviously was not the issue/solution.

I have attached the syslog and diagnostic zips. As a troubleshooting step, I wrote the syslog to the flash drive to see if I could capture any errors there. I did not see anything but I'm including it too.

System Overview
Unraid system: Unraid server Basic, version 6.8.3
Model: Custom
Motherboard: Supermicro - X8SIL
Processor: Intel® Xeon® CPU X3470 @ 2.93GHz
HVM: Enabled
IOMMU: Enabled
Cache: L1-Cache = 256 kB (max. capacity 256 kB)
       L2-Cache = 1024 kB (max. capacity 1024 kB)
       L3-Cache = 8192 kB (max. capacity 8192 kB)
Memory: 32 GB (max. installable capacity 32 GB)*
       DIMM1A = 8192 MB, 800 MT/s
       DIMM1B = 8192 MB, 800 MT/s
       DIMM2A = 8192 MB, 800 MT/s
       DIMM2B = 8192 MB, 800 MT/s
Network: bond0: fault-tolerance (active-backup), mtu 1500
       eth0: 1000Mb/s, full duplex, mtu 1500
       eth1: not connected
Kernel: Linux 4.19.107-Unraid x86_64
OpenSSL: 1.1.1d
P + Q algorithm: 7335 MB/s + 10054 MB/s
Uptime: 0 days, 0 hours, 3 minutes, 42 seconds

Attachments:
tower-syslog-20200420-0707.zip
tower-diagnostics-20200420-0305.zip
syslog (3)
JorgeB Posted April 20, 2020

That looks like a hardware problem. Try with only a couple of DIMMs at a time, using a single channel. Also try a different CPU if one is available.
Gumdomike (Author) Posted April 20, 2020

I had run a memtest on the box for about 35 hours and everything came back fine. If that passed, could it still be a RAM issue? Unfortunately I don't have another CPU to swap in. I've been going back and forth about ordering another one on eBay, but I don't want to spend the money unless that is definitely the issue.
JorgeB Posted April 20, 2020

10 minutes ago, Gumdomike said: "If that passed could it still be a RAM issue?"

Yes, memtest doesn't detect every type of error, though one that happens as frequently as you say should be detected. Try a single RAM channel like I mentioned, in case the CPU has issues with one of them.
Gumdomike (Author) Posted April 23, 2020

I took your advice and removed all of the RAM and put one stick back in. The system booted up fine and allowed me to transfer ~20 gigs of files across the network. Previously it would choke on files under 1 gig. This was promising, so I pushed my luck and added in another stick. I'm currently halfway through moving ~300 gigs worth of data and it is just chugging along.

I had 4 DIMMs in there originally. I'm not sure if I want to try to stick another back in just yet, or be content with 16 GB vs the original 32.

Thank you for all of your help. This has been very disheartening for quite a while now since I wasn't able to reliably use my server.