
Server rebooting


Solved by Thitt0927


Afternoon all,

 

I am having an issue where my server is consistently rebooting. At first I suspected the flash drive, so I made a backup and got a new drive; no change. The server was running 6.12.6, so I decided to do a fresh install of Unraid with the latest version. It still reboots, and pretty consistently about 15-30 seconds after I start the array. I then tried booting into safe mode; it stays alive a little longer, probably 3 or 4 minutes, but then reboots. When I booted into safe mode and started the array in maintenance mode, it stayed alive for about 5 minutes and rebooted after I started a parity check. Attached are the diags taken in maintenance mode before the last reboot, as well as the syslog from the flash drive. Any help would be greatly appreciated. Still learning Unraid troubleshooting. Thanks in advance. I would like to add that this server has run flawlessly since 2022, hence why I am "new" to troubleshooting Unraid.

hittflix-diagnostics-20240916-1704.zip syslog

Link to comment
1 minute ago, AceRimmer said:

If you don't have a separate syslog server, then enable syslog mirroring to flash. That will write all events to the log directory and will hopefully help you figure out what's causing the shutdown.

 

Just remember to turn it off again and delete the logs from your flash when you are done. 

The syslog that I attached was from mirroring to flash, and it was captured after a reboot occurred. I just have no clue how to read it to get any valuable information.

Link to comment

To add: I have had it in maintenance mode since I made the first post and it has not rebooted. It is now going on 15 minutes, which is the most stable it has been since this started yesterday, but I have a feeling that as soon as I start a parity check it will reboot again.

Edited by Thitt0927
typo
Link to comment

The logs suggest that your Unraid server is experiencing issues related to device resets and network connectivity.

 

Areas to investigate:

 

1. SAS/SCSI Devices: Several devices are showing "Power-on or device reset occurred" messages. This could indicate a problem with the SAS controller, cabling, or power delivery to your drives. Check connections and ensure all drives and enclosures are securely connected.

 

2. Network Interface (eth0): There's a recurring "Link is Down" and "Link is Up" cycle for the eth0 interface. This could indicate network instability, a cable issue, or a problem with the network card. Verifying the network cable, switch, or network card could help.

 

3. mpt2sas Log Entries: The "mpt2sas_cm0: log_info" messages might indicate a hardware issue related to your SAS controller. Consider checking the controller's health, firmware, and updating if necessary.
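
If reading the mirrored syslog by eye is a struggle, a small script can count the three kinds of message listed above. This is only a rough sketch: the search strings are copied from the messages quoted in this post, while the function names (`scan_syslog`, `scan_file`) and the pattern labels are made up for illustration.

```python
import re
from collections import Counter

# Search strings taken from the messages quoted in this thread.
# The dictionary keys are just illustrative labels.
PATTERNS = {
    "device_reset": re.compile(r"Power-on or device reset occurred"),
    "link_down": re.compile(r"Link is Down", re.IGNORECASE),
    "link_up": re.compile(r"Link is Up", re.IGNORECASE),
    "mpt2sas_log_info": re.compile(r"mpt2sas_cm0: log_info"),
}


def scan_syslog(lines):
    """Return (counts, hits) for an iterable of syslog lines.

    counts maps each pattern label to the number of matching lines;
    hits is a list of (label, line) pairs in file order.
    """
    counts = Counter()
    hits = []
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
                hits.append((label, line.rstrip("\n")))
    return counts, hits


def scan_file(path):
    """Scan a syslog file on disk, tolerating odd bytes in the log."""
    with open(path, errors="replace") as fh:
        return scan_syslog(fh)
```

Something like `counts, hits = scan_file("/boot/logs/syslog")` would then give a quick tally. That path is an assumption about where the flash mirror ends up; point it at wherever your syslog copy actually lives. The last few hits before each reboot are usually the interesting ones.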

Link to comment
13 minutes ago, AceRimmer said:

The logs suggest that your Unraid server is experiencing issues related to device resets and network connectivity.

 

Areas to investigate:

 

1. SAS/SCSI Devices: Several devices are showing "Power-on or device reset occurred" messages. This could indicate a problem with the SAS controller, cabling, or power delivery to your drives. Check connections and ensure all drives and enclosures are securely connected.

 

2. Network Interface (eth0): There's a recurring "Link is Down" and "Link is Up" cycle for the eth0 interface. This could indicate network instability, a cable issue, or a problem with the network card. Verifying the network cable, switch, or network card could help.

 

3. mpt2sas Log Entries: The "mpt2sas_cm0: log_info" messages might indicate a hardware issue related to your SAS controller. Consider checking the controller's health, firmware, and updating if necessary.

Thank you, I will take a look at the cabling. Sorry if this is a dumb question, but I have no clue how to check the health of the SAS controller. My only experience with them is buying one, installing it, and it just working LOL.

Link to comment

Check cabling going to sas controller.

 

Run lsscsi on terminal and ensure all drives are showing up.

 

Check SMART status of each disk if you can.

 

My sas controller cables aren't great, a small bit of flex on the cable and my disks start throwing CRC errors. 

 

Also assuming your PSU is large enough to handle all the drives and the rest of the system? 
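
For the lsscsi check, here is a rough sketch of how you could count the disks it reports without eyeballing the list. It assumes the default `lsscsi` layout, where each line starts with the `[H:C:T:L]` address, followed by the device type, and ends with the device node. The function names are hypothetical.

```python
import subprocess


def parse_lsscsi(text):
    """Parse plain `lsscsi` output into (address, type, node) tuples.

    Assumes the default layout: "[H:C:T:L] type vendor model rev node".
    Lines that do not start with "[" are skipped.
    """
    devices = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3 or not parts[0].startswith("["):
            continue
        devices.append((parts[0], parts[1], parts[-1]))
    return devices


def disk_nodes(text):
    """Return the device nodes of every entry whose type is 'disk'."""
    return [node for _, dev_type, node in parse_lsscsi(text) if dev_type == "disk"]


def has_expected_disks(text, expected_count):
    """True if lsscsi reports at least the number of disks you expect."""
    return len(disk_nodes(text)) >= expected_count


def run_lsscsi():
    """Run lsscsi and return its output (requires lsscsi to be installed)."""
    result = subprocess.run(["lsscsi"], capture_output=True, text=True, check=True)
    return result.stdout
```

On the server you would call `disk_nodes(run_lsscsi())` and compare the count against the number of drives you know are installed. Compare counts rather than names, since `/dev/sdX` letters can change between boots. For the SMART side, `smartctl -H /dev/sdX` prints the overall health result for one disk.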

 

 

Link to comment
1 minute ago, AceRimmer said:

Check cabling going to sas controller.

 

Run lsscsi on terminal and ensure all drives are showing up.

 

Check SMART status of each disk if you can.

 

My sas controller cables aren't great, a small bit of flex on the cable and my disks start throwing CRC errors. 

 

Also assuming your PSU is large enough to handle all the drives and the rest of the system? 

 

 

All drives are showing up, and I installed a new cable. Most of the drives are in a NetApp DS4243 with its own PSUs, but maybe the PC PSU is failing? I tried to run a SMART report, but whenever the drives spin up it reboots. Thank you for your help so far!

Link to comment

Your scsi card is in IT mode? Do you know is it on the latest firmware?

 

Since you said "most" of your drives are in the netapp I would start by disconnecting that from your pc and try spin up what's in the pc first.

 

If that works then pull some drives out of the netapp to reduce the load and see can you get less drives to spin up on Unraid without crashing. 

Edited by AceRimmer
Link to comment
Just now, AceRimmer said:

Your scsi card is in IT mode? 

 

Since you said "most" of your drives are in the netapp I would start by disconnecting that from your pc and try spin up what's in the pc first.

 

If that works then pull some drives out of the netapp to reduce the load and see can you get less drives to spin up on Unraid without crashing. 

The card should be in IT mode (at least I assume it is; the ad said it was when I bought it), and it has been running well for a few years. If I remove drives and start the array, I am not at risk of losing any data, correct? Just want to make sure and not get ahead of myself.

Link to comment
19 minutes ago, AceRimmer said:

Your scsi card is in IT mode? Do you know is it on the latest firmware?

 

Since you said "most" of your drives are in the netapp I would start by disconnecting that from your pc and try spin up what's in the pc first.

 

If that works then pull some drives out of the netapp to reduce the load and see can you get less drives to spin up on Unraid without crashing. 

I disconnected the NetApp and it will not let me start the array; it says too many disks are missing.

Link to comment
  • Solution

Well, I think I figured it out. Kind of embarrassed I did not think of this earlier. The only change I made was a few weeks ago, when I installed a new, larger parity drive. It didn't cross my mind because (1) it was new, (2) it had been running for weeks, and (3) it precleared fine. But once I pulled the parity drive, the array starts and Unraid is stable again. Guess that drive is getting RMA'd. Thank you for the replies and troubleshooting assistance.

Edited by Thitt0927
Typo
Link to comment
