Dealing with unclean shutdowns



47 minutes ago, jonathanm said:

That sounds like something @limetech ought to at least investigate.

 

Thanks. I thought I'd ask because it seems so obvious and repeatable that I thought it might just be a "feature". I'd file a bug report if I could. I'm getting a 404 error at the moment though - already mentioned in the "Welcome to unraid.net" thread.

Link to comment
  • 4 months later...
  • 4 weeks later...

I've had to do a couple of "power off shutdowns". A problem with the ethernet port of the Supermicro mobo rendered the server unavailable via the normal admin webpage or SSH/telnet. It's a big Norco array of 20 disks, so at some point I just had to "flip the switch". A few other attempts to re-establish connectivity similarly failed.

 

I then did a mobo replacement, so now the ethernet config file for the Unraid kernel needs to be changed (I'll use nano on a Raspberry Pi to do this, resetting it to default on the USB stick). What should I do or expect when the machine boots? I don't have backups, so if the array comes up damaged, I don't want to do more damage. What are the recommendations for recovering RAID integrity at this point?

Link to comment

It should just boot up and recognize your disks assuming your flash drive is OK. You can just delete config/network.cfg and it will create a new default one. You might edit config/disk.cfg to turn autostart off just so you can check the assignments before starting the array.
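The two flash-drive changes above could be scripted from another machine with the USB stick mounted. This is only a minimal sketch, not an official tool: the function name and the mount path in the usage note are made up for illustration, and it assumes autostart is stored in config/disk.cfg as a line of the form startArray="yes". Check your own file before relying on that key name. It renames network.cfg instead of deleting it so the old settings can still be looked at.

```python
import re
import shutil
from pathlib import Path


def prepare_flash_config(flash_root: str) -> dict:
    """Move network.cfg aside and turn array autostart off on a mounted flash drive."""
    config = Path(flash_root) / "config"
    result = {"network_backed_up": False, "autostart_disabled": False}

    network = config / "network.cfg"
    if network.exists():
        # Rename rather than delete; a new default file is created on boot.
        shutil.move(str(network), str(network.with_name("network.cfg.bak")))
        result["network_backed_up"] = True

    disk = config / "disk.cfg"
    if disk.exists():
        # newline="" keeps the file's original line endings untouched.
        with open(disk, "r", newline="") as handle:
            text = handle.read()
        # ASSUMPTION: the autostart setting looks like startArray="yes".
        new_text, count = re.subn(
            r'^startArray="[^"]*"', 'startArray="no"', text, flags=re.MULTILINE
        )
        if count:
            with open(disk, "w", newline="") as handle:
                handle.write(new_text)
            result["autostart_disabled"] = True

    return result
```

For example, prepare_flash_config("/mnt/usb") would act on /mnt/usb/config (the mount point is hypothetical). If the assumed key is not found, disk.cfg is left unchanged and "autostart_disabled" stays False.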

Link to comment
  • 3 months later...

Perhaps I am missing a big piece of the issue of an unclean shutdown.

 

A power failure will cause an unclean shutdown. This will create massive parity errors.

 

Does that mean if I lose a data drive in a power failure and parity is now erroneous, I won't be able to reconstruct my damaged drive?

Link to comment
14 minutes ago, tunetyme said:

Perhaps I am missing a big piece of the issue of an unclean shutdown.

 

A power failure will cause an unclean shutdown. This will create massive parity errors.

 

Does that mean if I lose a data drive in a power failure and parity is now erroneous, I won't be able to reconstruct my damaged drive?

It depends on the power failure. Typically there will only be a few parity errors, and they are likely to correspond to writes that were in progress at the time of the failure.
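To illustrate why only the interrupted writes matter, here is a toy model of single XOR parity. It is a simplified sketch, not how the Unraid driver is actually written: each "disk" is a short list of integers, and all function names are invented for the example. One write reaches a data disk without the matching parity update, then a different disk is lost and rebuilt.

```python
from functools import reduce
from operator import xor


def compute_parity(disks):
    """Parity at each position is the XOR of all data disks at that position."""
    return [reduce(xor, column) for column in zip(*disks)]


def rebuild_disk(missing, disks, parity):
    """Rebuild one disk by XOR-ing parity with every surviving data disk."""
    survivors = [disk for index, disk in enumerate(disks) if index != missing]
    return [reduce(xor, column) for column in zip(parity, *survivors)]


def simulate_interrupted_write():
    disks = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
    parity = compute_parity(disks)

    # Power fails: the write reached disk 0 at position 2,
    # but parity was never updated to match.
    disks[0][2] = 99

    # Disk 1 is then lost and rebuilt from the (stale) parity.
    original = disks[1]
    rebuilt = rebuild_disk(1, disks, parity)
    return [pos for pos, (a, b) in enumerate(zip(rebuilt, original)) if a != b]
```

In this model simulate_interrupted_write() reports that only position 2 is rebuilt incorrectly; every position whose parity was in sync is recovered exactly.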

Link to comment
  • 3 months later...
12 minutes ago, RxLord said:

Please help...I am having issues with my Unraid system and unclean shutdowns.  I am not a computer programmer. Any help or recommendations are greatly appreciated. Thanks in advance.

tower-diagnostics-20190803-0233.zip 180.6 kB · 0 downloads

Please describe in detail exactly what circumstances were involved in your unclean shutdowns. (Server 'crashing', a normal attempt to shut down the server, etc.)

Link to comment

The server is rebooting/crashing on its own. If I look at my terminal it says "kernel panic not syncing", then it goes through the normal reboot process. When it reboots I get the unclean shutdown error, so I reboot the system and restart the array. If I don't reboot the system, a parity check will start and it reboots again without completing the parity check.

Link to comment
7 hours ago, RxLord said:

The server is rebooting/crashing on its own. If I look at my terminal it says "kernel panic not syncing", then it goes through the normal reboot process. When it reboots I get the unclean shutdown error, so I reboot the system and restart the array. If I don't reboot the system, a parity check will start and it reboots again without completing the parity check.

You are running 6.7.2, which has a new feature for use in cases like this called "Syslog Server". To use it, go to Settings >> Syslog Server and enable 'Mirror syslog to flash:'. (The 'Help' feature in the GUI will provide you with more information.) That file may provide more information to the real gurus. Did you do any hardware upgrades just before this all started? Any other system configuration changes or modifications?

Link to comment
  • 6 months later...

Unfortunately I've been struggling with a few different issues. It started with my cache drive filling up, which I think I figured out was due to a large VM with large file transfers. That would crash my system. So I ordered a new cache drive, a 2TB NVMe instead of the 1TB I had. Replacing the cache drive came with its own set of issues: the old cache drive had bad blocks, so I did an emergency recovery and salvaged what data I could onto the new cache drive. I also noticed that the BIOS on my board (ASRock B450 with a Ryzen 1600) had reset to its default config. Yes, lots of variables, unfortunately.

 

What I'm troubleshooting most recently is that, after installing the new drive, I've attempted to run a parity check twice but have had a hard lockup before it could complete. It's been taking two days to run and I usually wake up to an unresponsive system. This morning I decided to turn on the server and watch the syslog. Here is the error that kept recurring before it hard locked up again this morning.

 

Jul 14 07:26:34 Unraid kernel: CPU: 9 PID: 28040 Comm: awk Tainted: G D W 4.19.107-Unraid #1
Jul 14 07:26:34 Unraid kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B450 Pro4, BIOS P1.80 12/18/2018
Jul 14 07:26:34 Unraid kernel: RIP: 0010:__x86_indirect_thunk_rax+0x3/0x20
Jul 14 07:26:34 Unraid kernel: Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f ae e8 <ff> e0 0f 1f 84 00 00 00 00 00 0f 1f 40 00 66 66 2e 0f 1f 84 00 00
Jul 14 07:26:34 Unraid kernel: RSP: 0018:ffffc90008f07cb0 EFLAGS: 00010286
Jul 14 07:26:34 Unraid kernel: RAX: 81c6a760ffffffff RBX: ffffffff81e6aca4 RCX: 0000000000000000
Jul 14 07:26:34 Unraid kernel: RDX: 0000000000000c97 RSI: 0000000000000000 RDI: ffff8887f6312038
Jul 14 07:26:34 Unraid kernel: RBP: ffff8887f6312038 R08: ffffffff81d5e179 R09: 0000000000000001
Jul 14 07:26:34 Unraid kernel: R10: 0000000000000004 R11: ffff8884f61af369 R12: ffff8887f9dbe640
Jul 14 07:26:34 Unraid kernel: R13: 000000000001507a R14: 000000000000001f R15: 0000000000000000
Jul 14 07:26:34 Unraid kernel: FS: 0000147c83a4aa80(0000) GS:ffff8887fec40000(0000) knlGS:0000000000000000
Jul 14 07:26:34 Unraid kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 14 07:26:34 Unraid kernel: CR2: 00000000006c74a8 CR3: 00000004dc7c6000 CR4: 00000000003406e0
Jul 14 07:26:34 Unraid kernel: Call Trace:
Jul 14 07:26:34 Unraid kernel: ? kobject_put+0x78/0x8f
Jul 14 07:26:34 Unraid kernel: ? disk_part_iter_next+0x19/0xb2
Jul 14 07:26:34 Unraid kernel: ? diskstats_show+0x4d/0x4a5
Jul 14 07:26:34 Unraid kernel: ? klist_next+0x89/0xa8
Jul 14 07:26:34 Unraid kernel: ? seq_read+0x231/0x313
Jul 14 07:26:34 Unraid kernel: ? proc_reg_read+0x3b/0x59
Jul 14 07:26:34 Unraid kernel: ? __vfs_read+0x32/0x132
Jul 14 07:26:34 Unraid kernel: ? vfs_read+0xa4/0x124
Jul 14 07:26:34 Unraid kernel: ? ksys_read+0x60/0xb2
Jul 14 07:26:34 Unraid kernel: ? do_syscall_64+0x57/0xf2
Jul 14 07:26:34 Unraid kernel: ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
Jul 14 07:26:34 Unraid kernel: Modules linked in: macvlan xt_nat xt_CHECKSUM ipt_REJECT ip6table_mangle ip6table_nat nf_nat_ipv6 iptable_mangle ip6table_filter ip6_tables vhost_net tun vhost tap veth ipt_MASQUERADE iptable_filter iptable_nat nf_nat_ipv4 nf_nat ip_tables xfs md_mod bonding mlx4_en mlx4_core r8169 realtek pcc_cpufreq edac_mce_amd kvm_amd wmi_bmof kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd i2c_piix4 ahci i2c_core libahci mpt3sas glue_helper ccp k10temp raid_class scsi_transport_sas nvme nvme_core wmi button [last unloaded: mlx4_core]
Jul 14 07:26:34 Unraid kernel: ---[ end trace d1d2a59faafdc2ec ]---
Jul 14 07:26:34 Unraid kernel: RIP: 0010:0xffff8887fad1d480
Jul 14 07:26:34 Unraid kernel: Code: 00 00 00 00 00 00 00 00 00 00 00 25 ea 81 ff ff ff ff d8 b5 31 fb 87 88 ff ff b5 05 00 00 03 00 00 00 40 a7 c6 81 ff ff ff ff <80> d4 d1 fa 87 88 ff ff 80 d4 d1 fa 87 88 ff ff 00 00 00 00 00 00
Jul 14 07:26:34 Unraid kernel: RSP: 0018:ffffc900098cfc48 EFLAGS: 00010282
Jul 14 07:26:34 Unraid kernel: RAX: ffff8887fad1d480 RBX: ffffffff81e6aca4 RCX: 0000000000000000
Jul 14 07:26:34 Unraid kernel: RDX: 0000000000000000 RSI: ffff8887f6312038 RDI: ffff8887fad1d428
Jul 14 07:26:34 Unraid kernel: RBP: ffff8887f6312038 R08: ffffffff81d5e179 R09: 0000000000000001
Jul 14 07:26:34 Unraid kernel: R10: 0000000000000004 R11: ffff888779cc0b80 R12: ffff8887f6312038
Jul 14 07:26:34 Unraid kernel: R13: 0000000000014ff2 R14: ffff8887fad1d428 R15: ffff8887fad1d480
Jul 14 07:26:34 Unraid kernel: FS: 0000147c83a4aa80(0000) GS:ffff8887fec40000(0000) knlGS:0000000000000000
Jul 14 07:26:34 Unraid kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 14 07:26:34 Unraid kernel: CR2: 00000000006c74a8 CR3: 00000004dc7c6000 CR4: 00000000003406e0
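When a trace like the one above recurs, it can help to pull out just the faulting symbol and the call-trace function names so repeated occurrences are easy to compare. The following is a rough sketch, not a full oops parser: the function names are made up, and the regular expressions only assume the "RIP: seg:symbol+0x.." and "? symbol+0x../0x.." shapes seen in this log.

```python
import re

# Matches call-trace entries such as "kernel: ? seq_read+0x231/0x313".
# \s+ after the "?" tolerates a line break inside a pasted log.
FRAME = re.compile(r"kernel:\s+\?\s+([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")

# Matches "RIP: 0010:__x86_indirect_thunk_rax+0x3/0x20" but not a bare address.
RIP = re.compile(r"RIP:\s+[0-9a-f]+:([A-Za-z_][\w.]*)\+0x[0-9a-f]+")


def call_trace_symbols(log_text: str) -> list:
    """Return call-trace function names in the order they appear."""
    return FRAME.findall(log_text)


def faulting_symbol(log_text: str):
    """Return the first symbolic RIP location, or None if there is none."""
    match = RIP.search(log_text)
    return match.group(1) if match else None
```

Applied to the excerpt in this post, the call trace includes diskstats_show and seq_read, which fits the "Comm: awk" line: the process that hit the fault was reading disk statistics at the time.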

syslog.txt

Edited by mc_866
Added full log
Link to comment
  • 1 year later...
  • 4 weeks later...

I did a normal reboot via the GUI on Unraid and I got this warning:

 

Parity Check Tuning
Automatic unRaid Non-Correcting Parity Check will be started
1629649753
Unclean shutdown detected
warning
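The bare number in the notification looks like a Unix timestamp (seconds since 1970-01-01 UTC); that reading is an assumption, since the notification itself does not label it. A small sketch for turning it into a readable time, with a made-up function name:

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds: int) -> str:
    """Format a Unix timestamp as an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat()
```

Under that assumption, epoch_to_utc(1629649753) gives 2021-08-22T16:29:13+00:00.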

 

I don't have any VMs currently running; they are off, and only 6 Docker containers are online.

Link to comment
55 minutes ago, KoNeko said:

I did a normal reboot via the GUI on Unraid and I got this warning:

 

Parity Check Tuning
Automatic unRaid Non-Correcting Parity Check will be started
1629649753
Unclean shutdown detected
warning

 

I don't have any VMs currently running; they are off, and only 6 Docker containers are online.


Did an automatic check actually start? The reason I am asking is that in testing I occasionally get the plugin outputting that warning while the automatic check does not actually start. I have not identified the exact conditions where the plugin thinks it was an unclean shutdown but Unraid decides it was not.

Link to comment
3 minutes ago, itimpi said:


Did an automatic check actually start? The reason I am asking is that in testing I occasionally get the plugin outputting that warning while the automatic check does not actually start. I have not identified the exact conditions where the plugin thinks it was an unclean shutdown but Unraid decides it was not.

Yes, the parity check did start automatically.

Link to comment
1 minute ago, KoNeko said:

Yes, the parity check did start automatically.

Thanks for letting me know.

 

You should only get an unclean shutdown if one of the shutdown timeouts triggers. I would suggest you try hitting the button to Stop the array and time how long it takes. You need to make sure the Disk Settings -> shutdown timeout setting is longer than that.
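The comparison described above is simple enough to write down. This is only an illustrative sketch: the function names are invented, the stop time has to be measured by hand (or by timing whatever action you use to stop the array), and the 30-second safety margin is an arbitrary assumption, not an Unraid recommendation.

```python
import time


def time_action(action) -> float:
    """Run a callable and return how many seconds it took."""
    start = time.monotonic()
    action()
    return time.monotonic() - start


def timeout_is_long_enough(measured_stop_seconds: float,
                           timeout_seconds: float,
                           margin_seconds: float = 30.0) -> bool:
    """True if the configured timeout covers the measured stop time plus a margin."""
    # ASSUMPTION: the margin is a cushion chosen for this example only.
    return timeout_seconds >= measured_stop_seconds + margin_seconds
```

For instance, if stopping the array took 75 seconds, timeout_is_long_enough(75, 90) is False with the default margin, while timeout_is_long_enough(75, 120) is True.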

Link to comment
13 minutes ago, itimpi said:

Thanks for letting me know.

 

You should only get an unclean shutdown if one of the shutdown timeouts triggers. I would suggest you try hitting the button to Stop the array and time how long it takes. You need to make sure the Disk Settings -> shutdown timeout setting is longer than that.

I did increase the timeout somewhat, but timing the stop is a good idea.

 

Will do that once the check is done, in a day or two.

 

Link to comment
1 minute ago, KoNeko said:

I did increase the timeout somewhat, but timing the stop is a good idea.

 

Will do that once the check is done, in a day or two.

 

 

My experience is that any errors in such a situation as yours tend to be at the start of the parity check so it is not always necessary to run it to the end.

 

Link to comment
  • 2 weeks later...

Hello there

 

I have a major problem with my new server setup.

The server runs in GUI mode, and I successfully set up two Win10 VMs with GPU passthrough.

Two of the 3 external USB-to-NVMe drives are on the same controller and passed through to Win10 VM #1; the remaining external drive is shown in Unraid as an unassigned device with the "pass" option enabled and is used as a virtual drive on Win10 VM #2.

 

When rebooting, the reboot splash screen appears first, with a timer counting until localhost loses the connection; then nothing happens.

I can still access the terminal window, and using the command "shutdown -t 5 now" forces the server into action, but only until it tries to allocate the Unraid USB flash drive. Then nothing happens, forcing me to hard reset the server every time.

 

Could the problem be with the external drive on the controller managed by Unraid, given that rebooting the server with that drive disconnected works?

 

I have also posted this issue for support in the Unassigned Devices plugin forum.

 

It would be very much appreciated if someone could have a look at the diagnostics file.

galactica-diagnostics-20210831-1030.zip

Edited by TIE Fighter
Link to comment
