insomnus


  1. keywords: CPU stall, starved, jiffies!, rcu_sched, kthread, runc

     Hi All,

     SOS! My baby is slowly drifting into a coma! I am hoping for some help on this because I'm totally lost as to what is happening.

     A few days ago, my server started stalling and eventually becoming completely unresponsive. Unkillable processes (dockerd/runc) choke up the machine until it can't even accept a reboot or halt command. CPU load averages shoot past even my machine's theoretical maximum. Curiously, htop has stopped working: it runs but generates no output. Upgrading to unRAID v6.7 briefly improved things, but the problem returned.

     Currently, the server will run for about twenty minutes before entering a series of stalls until it freezes entirely. I can provoke stalls by having Plex transcode something heavy. However, the server will also eventually seize up without this provocation.

     I've installed mcelog and have collected a syslog with (I believe) mcelog's reports on CPU stalls and kernel traces. This log corresponds to successive stalls over a two-hour period until the server seized up entirely. The server remained unusable throughout, would fail to soft reset or shut down, and ultimately had to be powered off manually.

     CPU stalls for two hours and wont shut down doesn't shut down syslog.txt
     Stalls from another shorter run.txt
     Stalls triggered by Plex Transcoder Playback.txt

     I run on a collection of used server gear: dual hexa-core Intel Xeon E5645s in a Supermicro X8DTL-I mobo with 64 GB of ECC RAM and an LSI 9201-8I HBA flashed to IT mode. The array is 8 second-hand NAS drives (2x 4 TB dual parity, 6x 3 TB storage) and a pair of 650 GB flash drives for cache and VMs. It currently runs unRAID 6.7.0 stable.

     The stalls appear in the syslog as a series of entries that look like this:

     Please help!
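     For anyone who wants to skim the attached logs, here is a minimal sketch of how the stall entries could be pulled out of a syslog file. It assumes only that the relevant lines contain one of the keywords listed at the top of this post; the function names and the keyword tuple are made up for illustration and are not part of unRAID or mcelog.

```python
import re
import sys
from collections import Counter

# Keywords taken from the post's keyword list. Matching is a plain
# case-insensitive substring search; no particular log format is assumed.
KEYWORDS = ("rcu_sched", "stall", "starved", "jiffies")


def find_stall_lines(lines, keywords=KEYWORDS):
    """Return the lines that mention at least one keyword."""
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    return [line.rstrip("\n") for line in lines if pattern.search(line)]


def count_keywords(lines, keywords=KEYWORDS):
    """Count how many lines mention each keyword."""
    counts = Counter()
    for line in lines:
        lowered = line.lower()
        for keyword in keywords:
            if keyword in lowered:
                counts[keyword] += 1
    return counts


# Only runs when a log path is passed on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1], errors="replace") as fh:
        log_lines = fh.readlines()
    for hit in find_stall_lines(log_lines):
        print(hit)
    print(dict(count_keywords(log_lines)))
```

     Saved under any file name and run with the path to one of the attached logs as its only argument, it prints the matching lines followed by a per-keyword count, which may make it easier to see how often the stalls repeat during a run.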