Dealing with unclean shutdowns



1 hour ago, itimpi said:

That is one option.   

 

If you have the Dynamix File Manager plugin installed (or an equivalent for file managing) then you can use that to access the flash drive.

 

You can also click on the Boot device on the Main tab and go to the SMB settings to make it visible as the 'flash' share over the network.  By default it is not shared for security reasons.

 

I think I found it now. It's the file called syslog that will hopefully contain the answer? And is it safe to post the entire syslog file here or should I modify it in some ways?

 

Again, thank you so far

18 minutes ago, Enr379 said:

I think I found it now. It's the file called syslog that will hopefully contain the answer? And is it safe to post the entire syslog file here or should I modify it in some ways?

 

Again, thank you so far

Our normal recommendation is to post the whole file, but it is a text file so it is up to you to decide.   There should be nothing obviously sensitive in it, but some people do not like sharing information that most of us do not think is sensitive.
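
For anyone who would rather mask network details before uploading, the following is a minimal sketch, not an official Unraid tool. It assumes that IPv4 and MAC addresses are the only things worth hiding, and the function name `redact_syslog` and the placeholder strings are made up for illustration.

```shell
#!/bin/bash
# Mask IPv4 and MAC addresses in log text read from stdin.
# The replacement strings are arbitrary placeholders.
redact_syslog() {
  sed -E \
    -e 's/([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}/xx:xx:xx:xx:xx:xx/g' \
    -e 's/([0-9]{1,3}\.){3}[0-9]{1,3}/x.x.x.x/g'
}
```

It could be run as `redact_syslog < syslog > syslog-redacted.txt`. Timestamps such as 16:58:50 and version strings such as 6.12.4 are left alone because they have too few groups to match either pattern.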

1 hour ago, itimpi said:

Our normal recommendation is to post the whole file, but it is a text file so it is up to you to decide.   There should be nothing obviously sensitive in it, but some people do not like sharing information that most of us do not think is sensitive.

Here it is. Hopefully you can make some sense out of it

syslog

  • 3 months later...
  • 1 month later...

My server had an unclean shutdown due to power loss on Sunday just gone, and I have noticed that my cache disk (Samsung 500GB 2.5" disk) is now being reported as read only. I have a single cache drive. The server did run an automated parity check after the unclean shutdown, which reported 0 errors. I have also run the disk check, which returned 0 errors. The cache disk is plugged into the HBA card in IT mode, and I have swapped the data interface to a different port to rule out a port problem, but nothing has helped so far.

 

Since my cache disk and shares are being reported as read only, I have stopped and started the array and rebooted the server, the dockers and the VMs, but the error does not go away. The dockers and VMs cannot write to the share or storage.

 

I have shut down the dockers and VMs for now, as they are not functioning correctly with the storage being read only. I was thinking about getting a new cache disk, but I am not convinced that the SSD is the issue here, as no disk issues are being reported.

 

Can you please advise where I should start looking?

Cheers.
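
One starting point for a report like this is to confirm whether the kernel itself has the pool mounted read-only, as opposed to a permissions problem on the share. The sketch below assumes the cache pool is mounted at `/mnt/cache`; that path and the function name `is_mounted_ro` are assumptions for illustration and are not taken from the diagnostics.

```shell
#!/bin/bash
# Succeeds if the given mount point is listed with the "ro" option.
# $1 = mount point, $2 = mounts table (defaults to /proc/mounts).
is_mounted_ro() {
  awk -v mp="$1" '
    $2 == mp {
      n = split($4, opts, ",")
      for (i = 1; i <= n; i++) if (opts[i] == "ro") found = 1
    }
    END { exit(found ? 0 : 1) }
  ' "${2:-/proc/mounts}"
}
```

A possible use is `is_mounted_ro /mnt/cache && echo "cache is mounted read-only"`. If it is, the syslog lines from just before the remount are usually the next place to look.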

 

Edited by Babar
9 hours ago, Babar said:

My server had an unclean shutdown due to power loss on Sunday just gone, and I have noticed that my cache disk (Samsung 500GB 2.5" disk) is now being reported as read only. I have a single cache drive. The server did run an automated parity check after the unclean shutdown, which reported 0 errors. I have also run the disk check, which returned 0 errors. The cache disk is plugged into the HBA card in IT mode, and I have swapped the data interface to a different port to rule out a port problem, but nothing has helped so far.

 

Attaching diagnostics for reference here.

mug-fs-001-diagnostics-20231101-2143.zip


Hello guys, long-time user of Unraid, and now I'm experiencing unclean shutdowns: 4 this month. The last was today, when I upgraded to 6.12.4 and rebooted to finish the upgrade, after which a parity check was started. I have a 180 sec global shutdown timeout, 30 sec on Docker and 120 sec on VMs, and when I stop the array everything stops in a minute, more or less. So please help me; I'm posting diagnostics and the latest syslog file.
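
As a quick sanity check on the timeouts quoted above (180 s global, 30 s Docker, 120 s VMs), the arithmetic can be sketched as below. It assumes the Docker and VM stop phases run one after the other inside the global timeout, which is an assumption about ordering and not something confirmed in this thread; `shutdown_margin` is a made-up helper name.

```shell
#!/bin/bash
# Seconds left over if the Docker and VM stop timeouts run back to back
# inside the global shutdown timeout (an assumption about ordering).
shutdown_margin() {
  local global=$1 docker=$2 vms=$3
  echo $(( global - docker - vms ))
}

shutdown_margin 180 30 120   # prints 30
```

Under that assumption the settings leave 30 seconds of headroom for everything else, such as unmounting disks, so a mount that refuses to unmount can still push the shutdown past the limit.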

 

 

 

 

diagnostics-20231109-2051.zip unraid log.zip

7 hours ago, JorgeB said:

 

Nov  9 16:58:50 Umbrella emhttpd: shcmd (827): umount /var/lib/docker
Nov  9 16:58:50 Umbrella root: umount: /var/lib/docker: target is busy.

 

Docker didn't unmount, v6.12.4 should fix this.
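
For readers checking their own logs for the same symptom, here is a small sketch that pulls out every mount point that `umount` reported as busy. It only relies on the message format shown in the two log lines above; the function name `busy_mounts` is hypothetical.

```shell
#!/bin/bash
# Print each mount point that umount reported as busy, once.
busy_mounts() {
  sed -nE 's/.*umount: (.+): target is busy.*/\1/p' | sort -u
}
```

It reads the log on stdin, for example `busy_mounts < syslog`. On a running system, and assuming the `fuser` utility is installed, `fuser -vm /var/lib/docker` is one way to see which processes are holding the mount before the array is stopped.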

thank you so much for your time to look at this 🙃

Edited by tgiannak
  • 4 weeks later...
  • 1 month later...

After upgrading last week I am having some issues keeping my server up and running.  It seems to load with no issues, but after an unspecified amount of time I find that it has powered off.  I have not seen anything that indicates what the underlying problem is, so I am posting my diagnostics here to see if someone can spot what I am obviously missing.

 

The only thing that I can see is the following: 

Jan 23 13:26:33 Tower wsdd2[22229]: 'Terminated' signal received.

 

I disabled the WSD option in my samba settings, and will spin everything back up to see if I see any difference, but didn't think of trying this earlier due to all the older posts saying it had been resolved with the update to wsdd2 vs wsdd

 

 

Thanks in advance,

 

tower-diagnostics-20240123-1540.zip

Edited by mykpoz
22 hours ago, itimpi said:

The system powering itself off almost invariably indicates a hardware issue with the most common culprits being the CPU overheating or the PSU struggling (unless you are explicitly running something to power it off).

 

I was thinking the same thing, but was hoping to have seen something in the logs indicating that.  I have replaced the AIO CPU cooler which doesn't look like it was really being used correctly, with the stock AMD cooler.  I didn't feel like running a VM just to control the radiator.  The temps are about half and it's been running for 4 hours where normally it was turning off within 20 minutes.

 

Thanks again!

1 minute ago, mykpoz said:

I was thinking the same thing, but was hoping to have seen something in the logs indicating that.  I have replaced the AIO CPU cooler which doesn't look like it was really being used correctly, with the stock AMD cooler.  I didn't feel like running a VM just to control the radiator.  The temps are about half and it's been running for 4 hours where normally it was turning off within 20 minutes.

 

Thanks again!

I was in the same boat for months: unclean shutdowns, freezes, forced reboots. I replaced the USB thumb drive, stopped Docker and VMs, on and on, only for it to be fixed by a single line in the go file. Why was this it? Why is AMD susceptible to this issue?

 

#!/bin/bash
# Disable the C6 power state on the AMD CPU
/usr/local/sbin/zenstates --c6-disable
# Start the Management Utility
/usr/local/sbin/emhttp &

 

Since I disabled the C6 state with zenstates, and also in the CMOS/BIOS, my server hasn't had any issues for a week or so now, whereas before it would be an hourly or daily event.
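
To confirm that the go file really contains the line, a check along these lines could be used. It is only a sketch: the default path `/boot/config/go` is assumed to be where the go file lives on the flash drive, and `go_disables_c6` is a made-up helper name.

```shell
#!/bin/bash
# Succeeds if the go file has an uncommented zenstates line that
# passes --c6-disable. $1 = go file (the default is an assumed path).
go_disables_c6() {
  grep -Eq '^[[:space:]]*[^#[:space:]].*zenstates.*--c6-disable' "${1:-/boot/config/go}"
}
```

For example, `go_disables_c6 && echo "C6 is disabled at boot"`. A line that has been commented out with `#` does not count as a match.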

 

I didn't even know I could have looked for this, but on searching further there are multiple posts regarding this and AMD. They are a bit old, but you'd think all these plugins would notify you to check on it, e.g. tips and tricks, common problems, base OS, etc.

 

Glad I fixed it, because it was becoming a burden and I was starting to regret jumping into Unraid instead of an out-of-the-box NAS. It did take premium support to narrow it down, but it was worth every penny.


Hello there,

 

I hope I'm posting this in the correct thread—if not, please correct me.

My Unraid system freezes approximately once a day, to the extent that I have to press the power button for a few seconds.

 

I changed some hard drives recently and also restored my libvirt.img because I accidentally deleted it (1-day-old backup, no changes made after the backup).

 

I'm mirroring my syslog to my Unraid Stick.

 

I encounter the following error in the log multiple times a day, and sometimes the server simply freezes.

 

I've checked numerous aspects, but I'm unable to pinpoint the issue.

I also changed from macvlan to ipvlan because my log mentioned it.

I'm hopeful that you guys might have some insights.

 

Thank you very much!

 

Quote

Datenbunker kernel: CPU 11/KVM: page allocation failure: order:4, mode:0xdc0(GFP_KERNEL|__GFP_ZERO), nodemask=(null),cpuset=vcpu11,mems_allowed=0

Datenbunker kernel: CPU: 16 PID: 429 Comm: CPU 11/KVM Tainted: P W O 6.1.64-Unraid #1
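
For context on the quoted error: `order:4` means the kernel asked for 2^4 = 16 physically contiguous pages in one piece. The sketch below turns an order into a size and counts how often the failure appears in a log. It assumes 4 KiB pages, which is the usual x86_64 page size, and both function names are made up for illustration.

```shell
#!/bin/bash
# Size in KiB of one allocation of the given order,
# assuming 4 KiB pages (the usual x86_64 page size).
order_to_kib() {
  echo $(( (1 << $1) * 4 ))
}

# Count "page allocation failure" lines in log text read from stdin.
count_alloc_failures() {
  grep -c 'page allocation failure' || true
}

order_to_kib 4   # prints 64
```

So each failed request was for a 64 KiB contiguous block. Running `count_alloc_failures < syslog` against the mirrored syslog gives a rough idea of how often it happens between freezes.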

 

 

 

 

Log_Kernel_Error.txt datenbunker-diagnostics-20240126-1938.zip

Log_Kernel_Error_2.txt

Edited by anthem221

Sadly yes. I passed the BD to a docker the old way; I need to do it the new way, with add device in the docker settings.

I had unplugged it for three days and still had the freezes. 

I will disconnect it again till it's repaired, probably the best.

 

I had a lot of log entries saying that something was wrong. I fixed everything I could, but the original failure still remains.

Edited by anthem221