Unraid keeps crashing on Disk Clearing


Martinxds

Hi All,

 

Have an issue I was hoping to get assistance with.

I have an Unraid server recently set-up:

Unraid Server Version: 6.9.2. Hardware: AMD 3400G, ASRock B550 Steel Legend, 550W PSU, SilverStone ECS06 with ASMedia 1166, 32GB Kingston USB 3.2

Current HDD: 5x Seagate IronWolf 8TB, 1x WD Red 8TB

Current Cache: Samsung 860 Evo 1TB, Transcend 240GB

Docker Containers: Plex, SabNzbd, Sonarr, Radarr, Prowlarr, Unifi Controller, binhex-preclear

 

I am migrating from an 8-bay Synology NAS, so there are more HDDs to add soon.

 

Currently I have 4x Seagate 8TB IronWolfs set up with 1 parity disk. I am trying to add the other Seagate 8TB and the WD 8TB disk, and I have been doing them each separately. The clearing runs for several hours and then, before it finishes (I presume), the server crashes and becomes unresponsive on the network; this is normally near the end of the clear. This has happened several times. I have tried just adding the disk to the array and letting it clear in the background, and I have also tried the binhex-preclear docker container.

 

I have been running it headless, so once it crashes I have seen nothing. I have added both drives to the array at once this time to see what happens. I've attached the diagnostics report. I am loving Unraid so far (once I realised all my original problems were due to a Marvell 88SE9230 chipset RAID card) and hope I can figure out why it's crashing before I add more disks!

 

hydra-diagnostics-20210916-0902.zip

Edited by Martinxds
Link to comment

Thanks for that. I looked into it and found this was due to my Ryzen CPU: 

so I have made that settings change. I then had another crash with different error messages that pointed to sdf1, so I am removing the cache drives for now.

 

Before any of that, though, adding the 2x 8TB drives did work. It has crashed since, so it looks like the clearing wasn't causing the issues; the issues were just stopping the clearing from completing.

I'm running a parity check now, which so far shows:

Sync errors corrected: 34

So once that is done I'll remove the cache drive and just let it run for a while to test stability before adding more drives/cache etc.

Link to comment

Reading a few other threads, it looks like this can be caused by heavy usage on the onboard SATA controller.

Since I have a 6-port SATA card and 6 SATA ports on the motherboard, and plan to use just 10 SATA devices in total, I've moved a couple of drives over to the card to minimize the load on the onboard controller.

 

Let's try another parity build! :(

Link to comment

Thanks. From now on, when it can't get the diagnostics file I'll grab the syslog :) .

 

So good(?) news: it hasn't hard crashed since yesterday morning. But today the shares had disappeared. Attached is the diagnostics file.

 

The log has this line in it: Sep 26 07:40:15 Hydra kernel: shfs[2510]: segfault at 0 ip 00000000004043b8 sp 0000148eb975b780 error 4 in shfs[402000+b000]

Might this indicate a cache drive issue?

I had to reboot for shares to come back online.
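
For anyone trying to read a kernel segfault line like the one above, here is a rough Python sketch that splits it into its fields. It assumes the usual x86 Linux layout of the message and the x86 page-fault error-code bits (bit 0: page was present, bit 1: write access, bit 2: user mode); the function name `decode_segfault` is made up for this example. It only describes the fault and says nothing about which piece of hardware, if any, is responsible.

```python
import re

# Pattern for the common x86 Linux "segfault at ..." kernel message.
SEGFAULT_RE = re.compile(
    r"(?P<prog>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>\d+) "
    r"in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Return a dict describing the fault, or None if the line does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m["err"])
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    return {
        "program": m["prog"],
        "pid": int(m["pid"]),
        "fault_address": int(m["addr"], 16),   # address that was accessed
        "object": m["obj"],                    # mapped object holding the faulting code
        "offset_in_mapping": ip - base,        # instruction offset inside that mapping
        "page_present": bool(err & 1),         # x86 error-code bit 0
        "access": "write" if err & 2 else "read",   # bit 1
        "mode": "user" if err & 4 else "kernel",    # bit 2
    }


line = ("Sep 26 07:40:15 Hydra kernel: shfs[2510]: segfault at 0 "
        "ip 00000000004043b8 sp 0000148eb975b780 error 4 in shfs[402000+b000]")
info = decode_segfault(line)
print(info["program"], hex(info["offset_in_mapping"]), info["access"], info["mode"])
```

Read this way, the line describes a user-mode read of address 0 (a null pointer) inside the shfs binary itself, at offset 0x23b8 into its mapping.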

hydra-diagnostics-20210926-0748.zip

Link to comment

Unfortunately it crashed again without the ability to create a diagnostics file. The cp command doesn't work either, failing with this error:

cp: cannot create regular file '/boot/syslog.txt' : Input/output error

 

Is this potentially a USB issue? It is a brand new USB drive, but I'm going to back up the config, format it, and restore.

 

I did set it up to write the syslog to flash, but it doesn't seem to contain anything useful that I can see:

syslog
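
When a saved syslog is too long to read by eye, a small filter can help narrow it down. The Python sketch below keeps only lines that match a few common trouble keywords. The keyword list is my own assumption, not an official Unraid or kernel list, and `suspicious_lines` is a made-up helper name, so treat a clean result as "nothing obvious" rather than "nothing wrong".

```python
import re

# Illustrative keywords only (an assumption); extend as needed for your system.
SUSPECT = re.compile(
    r"segfault|call trace|machine check|i/o error|input/output error|"
    r"kernel panic|general protection",
    re.IGNORECASE,
)


def suspicious_lines(lines):
    """Return (line_number, text) pairs for lines that match SUSPECT."""
    return [
        (number, text.rstrip("\n"))
        for number, text in enumerate(lines, start=1)
        if SUSPECT.search(text)
    ]


sample = [
    "Sep 26 07:40:01 Hydra crond[1234]: exit status 0\n",
    "Sep 26 07:40:15 Hydra kernel: shfs[2510]: segfault at 0 ip 00000000004043b8 "
    "sp 0000148eb975b780 error 4 in shfs[402000+b000]\n",
]
for number, text in suspicious_lines(sample):
    print(number, text)
```

To run it against a real file, pass an open file object, for example `suspicious_lines(open("syslog.txt", errors="replace"))`, where `syslog.txt` stands in for wherever the copy was saved.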

 

 

 

Link to comment

So I formatted and swapped it over to a USB 2 port. It was stable for a good day and a half, sadly just had a crash now. Diagnostics attached. I can't see anything in there indicating what the issue was. hydra-diagnostics-20210928-1051.zip

 

I was able to log on to the console, create the diagnostics, and issue a reboot command, though. Might need to look at replacing other older equipment like the PSU? Can't justify a CPU platform swap though.

Link to comment

I found last night that I could make it crash and reproduce the error by watching something on Plex. I then swapped the power supply, but it still crashed on Plex usage. Watching the same thing directly via SMB worked fine.

I then turned Docker off and let it run a parity check overnight; it had crashed by the time I woke up. Trying again. I'll turn safe mode on once this is done/fails and see how we go. I set up syslog mirroring a while ago, and here it is attached: syslog (3)

 

My biggest problem, I think, is that I can't get it to write the diagnostics file when it crashes. The console responds, but it can't write anything to the file system or seem to access the USB drive.

 

Link to comment

I deleted the Docker image file in case it was corrupt, but no luck.

So I have swapped everything except the CPU, Mobo and RAM (and drives).

It crashed again just now: I had started the array with Docker stopped and begun deleting plugins when it went down.

I can confirm it also crashes as soon as I try to play anything in Plex.

 

Anything else I can test before I look at replacing the mobo and CPU?

 

Worth doing a New Config?

Edited by Martinxds
Link to comment
  • 9 months later...

Aw jeez, this all sounds exactly like what I'm going through. Had to do a lot of data migration recently between disks, and I've been having loads of crashes. Can I ask what mobo/CPU you had before and after? I have an ASRock X570 Pro4 and a Ryzen 3600X. I've been wanting to upgrade the CPU to either the 5800X3D or the 5900X anyway, but I was hoping to avoid a mobo swap.

 

Also, any lingering stability issues since your last post?

Link to comment

Blast from the past is this thread.

I can report it has been working fine since!

 

I had a 3400G with an ASRock B550 Steel Legend.

I replaced them with a 10400 and an MSI B560M-A PRO (supports GPU encoding for Plex, woo).

 

New information as well. About two months ago I used the old 3400G and the ASRock B550 Steel Legend for a budget build for a family member, and Windows would crash whenever a CPU-intensive game (Stellaris) was run. Every other game that didn't tax the CPU was fine.

This tells me the motherboard was faulty from the get-go, as the two are meant to be compatible with an updated BIOS.

Replaced the board with a B450 of some sort and Windows stopped crashing.

 

Reminds me I should try a warranty claim on that board now that I know it was the problem, across two systems and OSes. Not that I have a spare CPU to put in it any more.

 

This may not help with your current troubleshooting, as mine turned out to be a hardware fault, so I'd encourage you to open your own thread!

Link to comment
