pfields

  1. I was still suffering from crashes after changing the USB drive. In the end I swapped the Ryzen 7 1700 for a Ryzen 5 2600 and haven't had a crash in over 19 hours. It seems Unraid is still not playing well with first-gen Ryzen, even with the BIOS options set.
  2. I turned off Syslog mirroring this morning and opted to use a remote syslog server instead and haven't had a crash since. This leads me to believe that ordering a new USB key is a good bet in the short term.
  3. As a precaution, I just purchased a new USB drive to make sure the kernel panics aren't caused by the USB drive. Can anyone make anything of the kernel panic below?
  4. I set XMP Profile 1 this morning (2400MHz), but the server has just crashed again. There are always a few lines in the syslog before it goes dead:
       Dec 15 08:21:04 Tower nmbd[2659]: [2021/12/15 08:21:04.787064, 0] ../../source3/nmbd/nmbd_become_lmb.c:397(become_local_master_stage2)
       Dec 15 08:21:04 Tower nmbd[2659]: *****
       Dec 15 08:21:04 Tower nmbd[2659]:
       Dec 15 08:21:04 Tower nmbd[2659]: Samba name server TOWER is now a local master browser for workgroup WORKGROUP on subnet 172.17.0.1
       Dec 15 08:21:04 Tower nmbd[2659]:
       Dec 15 08:21:04 Tower nmbd[2659]: *****
       Dec 15 08:25:54 Tower ntpd[2029]: kernel reports TIME_ERROR: 0x41: Clock Unsynchronized
     I assume this has nothing to do with the time settings in Unraid, because NTP within Unraid shows the right time and I can communicate correctly with time1.google.com.
  5. The server crashed again after about 1h30, black screen on the physical monitor. Nothing responding. Can anyone decipher the following and also why am I getting these Clock Unsynchronized errors?
  6. OK, I've worked out the issue with Gigabyte and 'Typical Current Idle': there are two ways to reach the option in the BIOS, and the setting resets when changed one way but sticks when changed the other. The server doesn't seem to crash like before, but I keep getting restarts. I will randomly log in and see the uptime at 5 minutes even though the server has been on for an hour, for example. The only errors or warnings I can see in the log when I check are the following:
       Dec 14 13:36:03 Tower kernel: mce: [Hardware Error]: Machine check events logged
       Dec 14 13:36:03 Tower kernel: mce: [Hardware Error]: CPU 8: Machine Check: 0 Bank 5: bea0000000000108
       Dec 14 13:36:03 Tower kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffff81064b1e MISC d012000100000000 SYND 4d000000 IPID 500b000000000
       Dec 14 13:36:03 Tower kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1639488944 SOCKET 0 APIC 1 microcode 8001138
       Dec 14 13:36:03 Tower kernel: floppy0: no floppy controllers found
       Dec 14 13:36:03 Tower kernel: random: 7 urandom warning(s) missed due to ratelimiting
       Dec 14 13:36:03 Tower kernel: ACPI Warning: SystemIO range 0x0000000000000B00-0x0000000000000B08 conflicts with OpRegion 0x0000000000000B00-0x0000000000000B0F (\GSA1.SMBI) (20200925/utaddress-204)
       Dec 14 13:36:07 Tower rpc.statd[1976]: Failed to read /var/lib/nfs/state: Success
     What should I do going forward to try to diagnose the restarts? Cheers
  7. I'm on the most current 62d, I even tried downgrading to revision 61 but it seems to have the same behaviour. I've logged a ticket with Gigabyte.
  8. Hmm, it seems that it might be the same crash as before, because I just went to revert the change and found it was already set back to Auto. So the BIOS setting is reverting to Auto every time. It's a Gigabyte B450 AORUS M (rev. 1.1), if anyone knows why this option would keep reverting to Auto. It's not the CMOS, as the date and time are being held in memory.
  9. Well it works to begin with but I have just reproduced the error twice. It works initially, but then becomes unresponsive over the network. It responds to ping but nothing else works/loads. The server hasn't crashed as I can still log in on the physical machine. I logged in and then didn't do anything for 10-15 minutes at which point it was unresponsive, the Syslog shows nothing after my successful login.
  10. I set the Power Supply Idle Control to Typical Current Idle as suggested in the article and rebooted. Now the server seems to be in a weird state: it's apparently working OK when I check the physical screen, and I can ping it, but there is no SSH, no GUI, and no Docker apps. Is there something else I should do with the C-states?
  11. Hello, first of all, I know that it's not random, but at the minute I can't find any rhyme or reason for the crashes. After the first crash I checked the physical screen and it was all black with no sign of life; nothing was working, and I got it going again with a reboot. This morning I woke up to an Nginx 500 error when I tried to look at the GUI and had to reboot again. I took the server to the office this morning to try to diagnose what's happening, updated the BIOS, and restarted. I left it idle for about an hour and it seems to have crashed again: no SSH, no GUI, no shares, but some text on the screen. I have attached diagnostics below, but they were generated after the reboot. The syslog was being mirrored to flash and has very little information from before the crash. Could the kernel time error be causing this? tower-diagnostics-20211207-1127.zip syslog Any help would be much appreciated. Thanks
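For anyone trying to read the "Machine Check ... Bank 5: bea0000000000108" line quoted in post 6, here is a minimal Python sketch that pulls the CPU, bank, and status value out of such a syslog line and decodes the architectural flag bits of the status register (bits 63 to 57) plus the low 16-bit error code. The function names and the regular expression are my own and are only assumed to match the log format quoted in that post. Anything beyond these generic bits, such as what the error code means on a particular CPU, would need the vendor's documentation.

```python
import re

# Architectural flag bits of the x86 machine-check status register (MCi_STATUS).
STATUS_FLAGS = {
    63: "VAL",    # register contains a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # reporting was enabled for this error
    59: "MISCV",  # MISC register holds valid data
    58: "ADDRV",  # ADDR register holds valid data
    57: "PCC",    # processor context may be corrupt
}

# Assumed log shape, taken from the syslog excerpt quoted in the thread.
MCE_LINE = re.compile(
    r"CPU (\d+): Machine Check: \d+ Bank (\d+): ([0-9a-fA-F]{16})"
)


def decode_status(status):
    """Return the set flag names and the low 16-bit error code."""
    flags = [name for bit, name in STATUS_FLAGS.items() if (status >> bit) & 1]
    return {"flags": flags, "error_code": status & 0xFFFF}


def parse_mce_line(line):
    """Extract CPU, bank, and decoded status from one syslog line, or None."""
    match = MCE_LINE.search(line)
    if match is None:
        return None
    cpu, bank, status_hex = match.groups()
    result = decode_status(int(status_hex, 16))
    result.update(cpu=int(cpu), bank=int(bank))
    return result


if __name__ == "__main__":
    line = ("Dec 14 13:36:03 Tower kernel: mce: [Hardware Error]: "
            "CPU 8: Machine Check: 0 Bank 5: bea0000000000108")
    print(parse_mce_line(line))
```

For the value in the thread, the UC and PCC bits are both set, so the hardware reported an uncorrected error with possibly corrupted processor context. That points more toward a CPU, memory, or power problem than toward the USB drive, although it does not identify the faulty part on its own.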