randomjohn's Achievements





  1. I guess we can consider this one closed but not solved. It ran properly before, but a BIOS update came out recently for my motherboard, so I installed that, then upgraded from 6.8 stable to an Unraid test build (and now RC-1). In reaction to some temp warnings I was getting, I significantly increased the overall system cooling, to where the NVMe drive is now about 10C cooler at idle. Mobo and CPU are about 5C cooler (~29C at idle). Finally, I spent a lot of time tracking down NTP errors and switched out some of the default Google servers for the NTP pool for my region. I really hope I'm not jinxing anything, but I was at 4 days of error-free uptime when I posted this. Hopefully something in here helps the next poor soul.
  2. That's what I was afraid of. Any recommendations for good diagnostics? Weird that it would start after two months. Usually it's right away or after a long time. I did add and move fans, which significantly dropped internal temps, but maybe that was too late. I've attached the most recent diagnostics, but I don't really know what I'm looking for and nothing leapt out at me. Thanks for all your help. tower-diagnostics-20201206-2253.zip
  3. Any of this look interesting or relevant? Still trying to figure out what's going on with the ntp server.
     Dec 6 17:56:21 Tower ntpd[30258]: kernel reports TIME_ERROR: 0x2041: Clock Unsynchronized
     Dec 6 17:56:21 Tower ntpd[30258]: kernel reports TIME_ERROR: 0x2041: Clock Unsynchronized
     Dec 6 18:02:00 Tower ntpd[30258]: kernel reports TIME_ERROR: 0x2041: Clock Unsynchronized
     Dec 6 18:13:37 Tower kernel: mce: [Hardware Error]: Machine check events logged
     Dec 6 18:19:58 Tower kernel: mce: [Hardware Error]: Machine check events logged
     Dec 6 18:25:50 Tower kernel: python3[9144]: segfault at 689b401a4b3f ip 000014ebe5c85646 sp 000014ebd31f7160 error 4 in libpython3.8.so.1.0[14ebe5bc0000+1e7000]
     Dec 6 18:25:50 Tower kernel: Code: f8 41 89 f1 42 ff 24 f1 0f 1f 40 00 4c 89 ee 4c 89 3e 4c 8d 6e 08 4d 85 ff 0f 84 62 e8 ff ff 44 8b b3 50 02 00 00 45 85 f6 0f <85> 6c b7 ff ff 44 8b 9b 4c 02 00 00 4c 89 d7 48 2b 3c 24 45 85 db
  4. This worked for me, although it's showing a different temperature (3C difference) between Dashboard and Main.
  5. Only in the hope that it helps, not to pester: I'm having the same issue in Version 6.8.3 2020-03-05. I can see the temperature for my NVMe disk on the tower/Dashboard page in the "Unassigned" widget (bottom right for me), but not on tower/Main in the Unassigned Devices section. I just get an asterisk there. I'm fairly confident that it happened with the update today, because I installed a new fan earlier this afternoon and have been paying a lot of attention to the temperature in the hope of solving an entirely different problem. So I saw (and was watching) the temperature on the Main page right before the update. Hope that helps. Thank you.
  6. I ran both Memtest86+ 5.01 (thanks for your heads up to boot in legacy mode) and MemTest 86 v0.4. They each ran for at least four passes and about 20 additional hours and returned no errors. I've been running error-free in Safe Mode for ~24 hours now. Will be rather annoyed (but relieved) if it's the same problem @Wingold refers to - at least I'll know, but a significant part of the hardware purchase and Unraid installation was to up the horsepower for Plex streaming.
  7. OK, I'm trying to sift through the syslog now. I did waste a lot of space with csrf_token errors from having browser sessions open across a reboot. This is the only thing I've found that looks like it might be on point and I can't make heads or tails of it:
     Dec 1 12:47:07 Tower root: Fix Common Problems: Error: Machine Check Events detected on your server
     Dec 1 12:47:07 Tower root: Hardware event. This is not a software error.
     Dec 1 12:47:07 Tower root: MCE 0
     Dec 1 12:47:07 Tower root: CPU 0 BANK 0 TSC 23a4c0e61fe6
     Dec 1 12:47:07 Tower root: ADDR 1ffff8107baa3
     Dec 1 12:47:07 Tower root: TIME 1606800346 Tue Dec 1 00:25:46 2020
     Dec 1 12:47:07 Tower root: MCG status:
     Dec 1 12:47:07 Tower root: MCi status:
     Dec 1 12:47:07 Tower root: Corrected error
     Dec 1 12:47:07 Tower root: Error enabled
     Dec 1 12:47:07 Tower root: MCi_ADDR register valid
     Dec 1 12:47:07 Tower root: MCA: Instruction CACHE Level-0 Instruction-Fetch Error
     Dec 1 12:47:07 Tower root: STATUS 9400004000040150 MCGSTATUS 0
     Dec 1 12:47:07 Tower root: MCGCAP c0e APICID 0 SOCKETID 0
     Dec 1 12:47:07 Tower root: MICROCODE d6
     Dec 1 12:47:07 Tower root: CPUID Vendor Intel Family 6 Model 158
  8. OK, thanks for that. I'm now getting an MCE warning from Fix Common Problems. Any chance you see something in the attached, or is there a specific MCE area? (The machine is headless and I haven't moved it back to a monitor yet, so unless there's an option to boot to safe mode from the GUI, I haven't been able to do that yet. I stupidly put it away after running memtest.) tower-diagnostics-20201201-1205.zip
  9. Not sure if the syslog worked, but I've attached it. MemTest looks good - although I couldn't run it from the Unraid boot - I had to go to the website and create a new USB with just that on it to run MemTest. Could that be part of the problem? MemTest86-Report-20201129-111840.html syslog
  10. v6.8.3 (registered)
      Plugins: Fix common problems, unassigned devices, CA Backup / Restore Appdata, Community Applications, Dynamix S3 Sleep, Dynamix SSD TRIM, Dynamix System Buttons, Dynamix System Information, Dynamix System Statistics, Dynamix System Temperature, Nerd Tools, Unassigned Devices Plus
      Dockers: binhex-sabnzbd, plexinc plex media server, linuxserver radarr/sonarr/lidarr
      Hardware:
      Model: Custom
      M/B: ASUSTeK COMPUTER INC. PRIME Z390-A Version Rev 1.xx - s/n: 200873513103219
      BIOS: American Megatrends Inc. Version 1602. Dated: 06/04/2020
      CPU: Intel® Core™ i9-9900K CPU @ 3.60GHz
      HVM: Enabled
      IOMMU: Enabled
      Cache: 512 KiB, 2048 KiB, 16384 KiB
      Memory: 32 GiB DDR4 (max. installable capacity 64 GiB)
      Network: bond0: fault-tolerance (active-backup), mtu 1500
      eth0: 1000 Mbps, full duplex, mtu 1500
      Kernel: Linux 4.19.107-Unraid x86_64
      Diagnostics attached. Thanks in advance for your help.
      tower-diagnostics-20201128-1928.zip
  11. That's very helpful, thank you. The CPU overkill is partly because I may run a Windows VM at some point, and partly a misguided notion that the stronger CPU will help with transcoding (which is why I'm going with the integrated gfx). The PSU was not nearly as well thought out. Previous PCs were more for gaming and had separate gfx cards with ridiculous power requirements, so I just got in the habit of buying more than I need. I figured it couldn't hurt if I eventually wanted to power a lot of drives, but your point means maybe I just stumbled into a compelling reason anyway.
  12. I'm planning to build up the unraid storage slowly over time (I'm looking at you, Black Friday deals) because I've still got 2 TB left on a WD MyCloud EX4. I may or may not keep those in the case or move the drives to this server now. But it's just working, so no urgency. I'm replacing a dying Windows 7-based Plex Media Server/sabnzbd/radarr/sonarr/lidarr setup. I've run linux before and want to be able to use dockers to fool around with a bunch of automation stuff. Almost all of the streaming is to FireTVs within the house or when I'm traveling for work. Rarely will there be multiple users. Is this overkill?
      PCPartPicker Part List
      CPU: Intel Core i9-9900K (Standard Folding Box) 3.6 GHz 8-Core Processor ($392.89 @ B&H)
      CPU Cooler: Cooler Master Hyper 212 EVO 82.9 CFM Sleeve Bearing CPU Cooler ($34.99 @ Amazon)
      Motherboard: Asus PRIME Z390-A ATX LGA1151 Motherboard ($167.89 @ Amazon)
      Memory: G.Skill Ripjaws V Series 32 GB (2 x 16 GB) DDR4-3200 CL16 Memory ($109.99 @ Newegg)
      Storage: Samsung 970 Evo 1 TB M.2-2280 NVME Solid State Drive ($149.99 @ Amazon)
      Case: Fractal Design Define R5 ATX Mid Tower Case ($123.99 @ Amazon)
      Power Supply: Corsair RM (2019) 750 W 80+ Gold Certified Fully Modular ATX Power Supply ($124.99 @ Corsair)
      Total: $1104.73
      Prices include shipping, taxes, and discounts when available
      Generated by PCPartPicker (https://pcpartpicker.com) 2020-10-25 13:57 EDT-0400
      I'm going to shuck a couple of WD 12TB Elements for starters. Appreciate your attention and comments. Thank you
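
For anyone else staring at similar logs: the STATUS value in the MCE record from post 7 and the mixed syslog lines from post 3 can be picked apart with a few lines of Python. This is only a rough sketch, not an Unraid or mcelog tool. The function names `decode_mci_status` and `count_events` are made up for illustration, the flag positions are the architectural bits of the Intel IA32_MCi_STATUS register, and the three log patterns are simply lifted from the lines quoted above, so treat them as assumptions about what your own syslog will contain.

```python
import re

# Architectural flag bits of the Intel IA32_MCi_STATUS register.
STATUS_FLAGS = {
    63: "VAL (register contents valid)",
    62: "OVER (an earlier error was overwritten)",
    61: "UC (uncorrected error)",
    60: "EN (error reporting enabled)",
    59: "MISCV (MCi_MISC register valid)",
    58: "ADDRV (MCi_ADDR register valid)",
    57: "PCC (processor context corrupt)",
}


def decode_mci_status(status_hex):
    """Decode the flag bits and the low 16-bit MCA error code of a STATUS value."""
    status = int(status_hex, 16)
    flags = [name for bit, name in STATUS_FLAGS.items() if (status >> bit) & 1]
    return {
        "flags": flags,
        "corrected": not ((status >> 61) & 1),  # UC bit clear means corrected
        "mca_code": status & 0xFFFF,
    }


# Hypothetical buckets for a first pass over a syslog; the patterns are
# copied from the log lines quoted in the posts, not from any official list.
PATTERNS = {
    "mce": re.compile(r"mce: \[Hardware Error\]"),
    "ntp": re.compile(r"ntpd\[\d+\]: kernel reports TIME_ERROR"),
    "segfault": re.compile(r"segfault at"),
}


def count_events(lines):
    """Count how many syslog lines fall into each bucket."""
    counts = {name: 0 for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    result = decode_mci_status("9400004000040150")
    print(result["flags"], result["corrected"], hex(result["mca_code"]))

    sample = [
        "Dec 6 17:56:21 Tower ntpd[30258]: kernel reports TIME_ERROR: 0x2041: Clock Unsynchronized",
        "Dec 6 18:13:37 Tower kernel: mce: [Hardware Error]: Machine check events logged",
        "Dec 6 18:25:50 Tower kernel: python3[9144]: segfault at 689b401a4b3f ip 000014ebe5c85646",
    ]
    print(count_events(sample))
```

For the STATUS value quoted in post 7, the set flags are VAL, EN and ADDRV with UC clear, which lines up with the "Corrected error", "Error enabled" and "MCi_ADDR register valid" lines in the same record. A corrected error means the CPU recovered on its own, which fits the system staying up, but it says nothing about the root cause.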