Server has been hanging and becoming non-responsive


As the title says, on Saturday the server started acting up. I could access the Unraid UI but couldn't really do anything. Rebooting from the UI did nothing. I could access the terminal, but I couldn't get the reboot command to work there either. CPU was pegged at 100%. I had to hard reboot it.

 

Parity check passed fine after bringing the array online.

 

Tonight, I came home after a few hours away and I couldn't even access the Unraid UI through a web browser, and I couldn't ping the box (so obviously I couldn't SSH in either). I could still reach the iDRAC on the box and it showed no issues of note, but I couldn't wake the box up to get direct video out. Naturally, I'm leaning toward a dreaded hardware issue.

 

I've attached Unraid logs... Hopefully someone more well versed than I can see a glaring issue in here?

 

Syslogs attached

kuiper-diagnostics-20230926-2310.zip

Link to comment
5 hours ago, JorgeB said:

Correct, make sure you set the Unraid server IP in the remote server field; it's a common mistake.

I'm all about the common mistakes... like this one... that I made. :D Thanks.

 

4 hours ago, itimpi said:

It is often easier to set the “mirror to flash” option to get the output written to the ‘logs’ folder on the flash drive.

So this will write them to my boot drive too then (being the flash drive)? Sounds good.

Link to comment
15 minutes ago, JorgeB said:

Both logs only cover a few minutes, at what time was the crash?

Unknown. Sometime between 11pm and 6:30ish. It became unusable overnight; when I came in this morning, it was unresponsive.

 

I'm seeing several very similar threads. Going through them, the symptoms look pretty much identical. Hoping that, between all the threads, some light can be shed on what may be causing it.

Edited by wes.crockett
Link to comment
35 minutes ago, JorgeB said:

Both logs only cover a few minutes

Forget that, I missed the first couple of lines, since I was expecting more activity logged before the crash:

 

Sep 27 15:12:16 Kuiper monitor: Stop running nchan processes
Sep 27 21:21:14 Kuiper webGUI: Successful login user root from 192.168.1.162
Sep 28 06:40:04 Kuiper kernel: Linux version 6.1.49-Unraid (root@Develop-612) (gcc (GCC) 12.2.0, 

 

Unfortunately there's nothing relevant logged, which usually points to a hardware issue. One thing you can try is to boot the server in safe mode with all Docker containers and VMs disabled and let it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem; if it doesn't, start turning the other services back on one by one.
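
For anyone going through a longer mirrored syslog, here is a rough sketch of how the crash window could be narrowed down: it looks for the first kernel line of each boot and reports the timestamp of the entry just before it. The names `parse_ts`, `crash_windows` and `BOOT_MARKER` are made up for this example, and the year is an assumption, since classic syslog timestamps don't include one.

```python
from datetime import datetime

# Assumption: the first kernel line after a boot contains this text,
# as in the excerpt above.
BOOT_MARKER = "kernel: Linux version"


def parse_ts(line, year=2023):
    # Syslog timestamps omit the year, so an assumed year is supplied.
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def crash_windows(lines, year=2023):
    """Return (last_entry_before_boot, boot_time) pairs for each boot found."""
    windows = []
    previous = None
    for line in lines:
        line = line.rstrip("\n")
        if len(line) < 15:
            continue
        try:
            ts = parse_ts(line, year)
        except ValueError:
            continue  # skip lines that don't start with a timestamp
        if BOOT_MARKER in line and previous is not None:
            windows.append((previous, ts))
        previous = ts
    return windows


sample = [
    "Sep 27 15:12:16 Kuiper monitor: Stop running nchan processes",
    "Sep 27 21:21:14 Kuiper webGUI: Successful login user root from 192.168.1.162",
    "Sep 28 06:40:04 Kuiper kernel: Linux version 6.1.49-Unraid (root@Develop-612) (gcc (GCC) 12.2.0, ",
]

for last_seen, booted in crash_windows(sample):
    print(f"last entry {last_seen}, next boot {booted}, gap {booted - last_seen}")
```

The window it reports is only an upper bound: the crash happened somewhere between the last logged entry and the next boot, and a quiet server may simply not have logged anything for hours before it hung.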

Link to comment

I just ran /sbin/reboot as a user script. Server rebooted fine but array didn't come back online on its own. Is that normal behavior when running /sbin/reboot?

 

My thinking is to try nightly reboots and see whether the issue still crops up every now and then.

 

EDIT:

I'm dumb... that just flat out reboots Linux. Is there a prebuilt script for safely rebooting Unraid in the proper order?

Edited by wes.crockett
Link to comment
17 minutes ago, wes.crockett said:

Is that normal behavior when running /sbin/reboot?

No.

 

17 minutes ago, wes.crockett said:

EDIT:

I'm dumb... that just flat out reboots Linux. Is there a prebuilt script for safely rebooting Unraid in the proper order?

Reboot works with Unraid; it will start a clean shutdown and then reboot.
