Greeno237

  1. 30 days stable. I think it's safe to say my problems were a result of my damaged hardware. Thank you so very much to my troubleshooters trurl and Dissones! I never would've figured this out on my own; I had no idea where to start. I appreciate you taking the time to help.
  2. Ok, so I did some digging on this, and apparently the large numbers are due to the way Seagate encodes the data; large numbers in the raw values are normal. Source: http://www.users.on.net/~fzabkar/HDD/Seagate_SER_RRER_HEC.html The ".../usb3/3-8/..." part seems to be some kind of naming scheme for the motherboard's USB ports. If I switch the boot device back to the 3.0 slot, it shows up as "usb4/4-x", and if I switch it to the other 2.0 port on the front I/O panel, the second digit of 3-8 changes. I'm hoping at this point that all my troubles were from the damaged socket. Like I said previously, I will keep this updated and mark it as solved if everything runs smoothly now.
  3. Sorry for the radio silence, it's been a busy few weeks. It took me some digging, but I think I found the problem. My socket had bent pins! I replaced the motherboard today and got the server up and running. I will report back on the stability of the system going forward. I'm not sure what was causing the errors that Dissones found. I'll take a look at the numbers now.
  4. Oh, I must be dense. Sorry, of course I can make a new one. I was just grabbing the one off the flash drive, as you said. server237-diagnostics-20200311-1335.zip I realized power issues could include loose cables, so I double-checked the power cords, and all of them seem snug. I could try replacing the cord that plugs the PSU into the surge protector; I believe I have some spares around. Memtest ran 10 full cycles over 18.5 hours with no errors.
  5. Ok, I think I get the structure of those two files, as I can see that the syslog is updated constantly while the diagnostics file hasn't changed since Feb 23. I will continue to upload the syslog to pass along the most recent info as we progress with troubleshooting. I'm fairly sure that cooling and power are reliable. This hardware used to be my main rig; I was running the same hardware with different SSDs/HDDs plus a GTX 1080. I was slightly overclocking the CPU to 4.7GHz from 4.4GHz, on air, using a BeQuiet Dark Rock 3, which has a 250W TDP rating. Under stress test I was hitting the high 80s or maybe 90 degrees, but it was stable when running Win 10. The PSU is a CM600X Corsair from 2014. It is 80+ Bronze certified, though it certainly isn't top of the line, and it is 5-6 years old now. It performed well while overclocking the old rig, and I'm not asking nearly as much from it in the current setup. I have 3 fans moving air in the current full ATX case, and it was recently dusted. CPU temp in memtest is currently bouncing between 55 and 60 degrees. I have no frame of reference, but that seems normal for a light load. I think it was in the low 50s when running the single-core test. I don't remember noticing high temps when stressing the server (running a W10 VM and 1 Plex stream is probably the hardest I've pushed it since setup), but I can't say for certain. That's all the relevant info I can think of re: power and cooling, but I'll definitely keep a closer eye on temperature data moving forward.
  6. Thanks for the answers, very helpful! In previous server crashes I was also unable to do anything from a direct connection with a monitor and a USB keyboard. I ran memtest for about 6 hours and got through 4 complete cycles (pass) with 0 errors, so I started up the server again yesterday. It ran for about 23 hours with no problems and then just rebooted all by itself while I was sitting here. This is not the same behaviour as the previous crashes; before, it would hard lock and I wouldn't be able to do anything. This time it rebooted itself and started Unraid at the command line. (Eventually I would like to run headless, but since I've been troubleshooting, I have a monitor and keyboard plugged in. I'm using a wireless USB keyboard/mouse combo.) At this time, I've restarted the memtest, choosing the multi-core option instead of the default. Should I be using multi-core? It seems like that is closer to a real-world test. I'm going to let this one run for at least 24 hours, but the 6 hours of clean memtest had me leaning towards the boot drive being plugged into USB3 as the source of the problem. Maybe it is/was? Maybe I have found a new problem? Tough to tell with a slightly different symptom. Uploading the current syslog from the flash drive, here it is: Greeno237 syslog.zip This unplanned restart happened at Mar 10 18:48:xx in the log. Just for clarity, the boot USB and the keyboard/mouse dongle are plugged into USB 2.0 slots now. No other USB devices are plugged in. If there is anything else I should be looking for beyond errors in the memtest, please let me know.
  7. And I suppose I should've included the obvious question. Is it possible that using the boot drive in the USB 3.0 slot was the source of my crashes?
  8. I was definitely using it in a 3.0 slot. Thanks for letting me know that 2.0 is better; I obviously missed that. I've swapped it over now and restarted the memtest. Is this diagnostics file the better one to upload? Not the syslog? Or both? Here's the 'diagnostics' one. I felt it was less relevant since all the events are from a few weeks ago. This file and the one from my original post are the two zip files I found on the flash drive after enabling syslog mirroring to the flash drive to try to see what is going on. When this box locks up I have no access through the GUI or SSH, and I can't even ping it. server237-diagnostics-20200223-2209.zip
  9. Ok, I was getting an error at first when trying to launch memtest86; I had to make sure I was booting a non-UEFI instance of the USB drive. Now memtest is running. Do I just start it up and let it run for 24 hours or until I get an error?
  10. Hi everyone, Unraid newb here, trying to figure out why my server keeps crashing. When I first set up the box it ran for over a week with no problems, but now it is crashing all the time. Usually it stays up long enough to finish the parity check (~20 hrs), but not always. I enabled syslog mirroring to the flash drive, and this is the file that I got. I have no idea what to look for. When the server crashes I have no access through the webGUI or through SSH, and I can't even ping the server. Hardware: i7 4790k CPU, 16GB HyperX DDR3-1866 RAM, Z97 MSI Gaming 5 motherboard. What is going on and how do I start to track down the issue? 19:04:42 is when I rebooted the server after yet another unclean shutdown. Thanks! Edit: added hardware, off to try a memtest now. syslog.zip
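
For anyone else staring at a mirrored syslog with no idea what to look for, here is a rough sketch of one way to find where each crash or reboot sits in the file. It assumes that every boot begins with the kernel's "Linux version" banner and that the mirrored file on the flash drive holds more than one boot. The function name `find_boots` and the sample lines are made up for illustration; they are not taken from the attached logs.

```python
def find_boots(lines, marker="kernel: Linux version"):
    """Return (index, last_line_before_boot, boot_line) for each boot banner.

    Assumption: the kernel banner containing "Linux version" is the first
    thing logged on every boot, so the line just before it is the last
    message written before the crash or reboot.
    """
    boots = []
    previous = None
    for index, line in enumerate(lines):
        line = line.rstrip("\n")
        if marker in line:
            boots.append((index, previous, line))
        if line.strip():
            previous = line  # remember the most recent non-empty line
    return boots


# Invented sample lines in the usual syslog layout (not real log data).
sample = [
    "Mar 10 18:30:01 Tower kernel: example message\n",
    "Mar 10 18:31:40 Tower emhttpd: example message\n",
    "Mar 10 18:48:12 Tower kernel: Linux version (example banner)\n",
    "Mar 10 18:48:12 Tower kernel: example message\n",
]

for index, last_line, boot_line in find_boots(sample):
    print(f"boot at line {index}: {boot_line}")
    print(f"  last line before it: {last_line}")
```

To run it against a real file, something like `find_boots(open("syslog").readlines())` should work. If the last line before a boot banner is an ordinary message with no error near it, that at least suggests the box died without getting the chance to log anything, which points more towards hardware or power than software.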