Martinxds Posted September 15, 2021 (edited September 16, 2021)
Hi All,
I have an issue I was hoping to get assistance with. I have a recently set-up Unraid server:
Unraid Server Version: 6.9.2
Hardware: AMD 3400G, ASRock B550 Steel Legend, 550W PSU, SilverStone ECS06 with ASMedia1166, 32GB Kingston USB3.2
Current HDD: 5x Seagate IronWolf 8TB, 1x WD Red 8TB
Current Cache: Samsung 860 Evo 1TB, Transcend 240GB
Docker Containers: Plex, SabNzbd, Sonarr, Radarr, Prowlarr, Unifi Controller, binhex-preclear
I am migrating from an 8-bay Synology NAS, so there are more HDDs to add soon. Currently I have 4x Seagate 8TB IronWolfs set up with 1 parity disk. I am trying to add the other Seagate 8TB and the WD 8TB disk, and I have been doing them each separately. The clearing runs for several hours and then eventually, before it finishes (I presume), the server crashes with the network becoming unresponsive; this is normally near the end of the preclear. This has happened several times. I have tried just adding the disk to the array and letting it clear in the background, and I have also tried the binhex-preclear Docker container. I have been running it headless, so once it crashes I have seen nothing. I have added both drives to the array at once this time to see what happens. I've attached the Diagnostics report.
I am loving Unraid so far (once I realised all my original problems were due to a Marvell 88SE9230 chipset RAID card) and hope I can figure out why it's crashing before I add more disks!
hydra-diagnostics-20210916-0902.zip
JorgeB Posted September 16, 2021
Make sure Power Supply Idle Control is set correctly in the BIOS (on Ryzen boards this usually means "Typical Current Idle"); see the pinned FAQ thread for more info.
Martinxds Posted September 19, 2021
Thanks for that. I looked into it and found this was due to my Ryzen CPU, so I have made that settings change. I had another crash with different error messages that pointed to sdf1, so I am removing the cache drives for now. Before any of that, though, adding the 2 x 8TB drives did work. It has crashed since, so it looks like it wasn't the clearing causing the issues; the crashes were just stopping the clearing from completing. I'm running a parity check now, which so far shows:
Sync errors corrected: 34
Once that is done I'll remove the cache drive and just let it run for a while to test stability before adding more drives, cache etc.
Martinxds Posted September 23, 2021
Still hunting down the issue. I recently upgraded to the 6.10 release candidate as I was being affected by a reported issue. However, it crashed again today. Attached is the latest Diagnostic. This time it was running with no Docker containers, clearing or parity checks running. There was just a mounted drive copying to it via the SATA card.
hydra-diagnostics-20210923-1803.zip
JorgeB Posted September 23, 2021
There was a problem with the onboard SATA controller:
Sep 23 08:29:53 Hydra kernel: ahci 0000:02:00.1: AHCI controller unavailable!
This is quite common with some Ryzen boards, look for a BIOS update or use an add-on controller.
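For anyone wanting to check their own log for the same message, a minimal sketch follows. It assumes the syslog is at /var/log/syslog (the path used elsewhere in this thread) and that the message format matches the line quoted above; the function name `failed_ahci_controllers` is hypothetical, not part of Unraid.

```shell
#!/bin/bash
# Print the PCI address of every controller that logged
# "AHCI controller unavailable" in the given syslog file.
# failed_ahci_controllers is a hypothetical helper name.
failed_ahci_controllers() {
    local logfile="${1:-/var/log/syslog}"
    # keep only the matching kernel lines, cut out the PCI address,
    # and collapse repeats so each controller is listed once
    grep 'AHCI controller unavailable' "$logfile" \
        | sed -E 's/.*ahci ([0-9a-fA-F:.]+): AHCI controller unavailable.*/\1/' \
        | sort -u
}
```

Calling `failed_ahci_controllers` with no argument reads the live syslog. If it prints an address, `lspci -s <address>` should show which controller that is, provided `lspci` is available on the system.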
Martinxds Posted September 23, 2021
Ack! It's a brand new board and I thought I did a BIOS update. Let me check. Thanks!
Martinxds Posted September 23, 2021
Yep, it certainly has the latest BIOS on it. I'm at a loss here. Don't suppose you know of any other settings that would help me get this board stable?
Martinxds Posted September 23, 2021
Reading a few other threads, it looks like this can be caused by heavy usage on the onboard SATA controller. Since I have a 6-port SATA card and 6 SATA ports on the motherboard, and plan to use just 10 SATA devices in total, I've moved a couple over to the card to minimize the usage on the onboard controller. Let's try another parity build!
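When moving drives between the onboard ports and an add-on card, it can help to confirm which controller each disk actually ended up on. The sketch below is one way to do that on a Linux system, assuming the usual sysfs layout where /sys/block/sdX is a symlink into the PCI device tree; `disk_controllers` is a hypothetical helper name, and its optional argument exists only so it can be pointed at a test directory instead of /sys.

```shell
#!/bin/bash
# List each sdX disk with the PCI address of the controller it hangs off.
# disk_controllers is a hypothetical helper, not an Unraid command.
disk_controllers() {
    local sysroot="${1:-/sys}"
    local dev path addr
    for dev in "$sysroot"/block/sd*; do
        # skip the unexpanded pattern when no sd* disks exist
        [ -e "$dev" ] || continue
        # /sys/block/sdX is a symlink into the device tree; resolve it
        path=$(readlink -f "$dev")
        # the last PCI address in the resolved path is the controller
        addr=$(printf '%s\n' "$path" \
            | grep -oE '[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]' \
            | tail -n 1)
        printf '%s %s\n' "$(basename "$dev")" "${addr:-unknown}"
    done
}
```

Each output line has the form `sdX <pci-address>`. Disks that show the same address as the controller named in a kernel error are the ones still attached to it. USB drives also appear as sdX, so the flash drive will be listed against its USB controller.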
JorgeB Posted September 23, 2021
As mentioned, it's quite common with some Ryzen boards. If there's no BIOS update that helps, there's not much else you can do except avoid that controller.
Martinxds Posted September 23, 2021
Thanks for the assistance. At this point it's cheaper and less effort to just buy another SATA card rather than swapping out the motherboard, so I will go down that route. I'll check back in a week or two and let you know how I went.
Martinxds Posted September 24, 2021
2nd SATA card purchased. Will try a parity rebuild now. Fingers crossed!
Martinxds Posted September 24, 2021 (edited September 24, 2021)
And sadly another crash, and once again I can't create a diagnostic file.
JorgeB Posted September 24, 2021
At least get the syslog:
cp /var/log/syslog /boot/syslog.txt
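Since the crashes in this thread happen repeatedly, a variation on the cp command above that keeps each copy under its own name may be convenient. This is only a sketch: `save_syslog` is a hypothetical helper name, and the default paths are the ones from the command above.

```shell
#!/bin/bash
# Copy the syslog to the flash drive under a timestamped name so that
# earlier copies are not overwritten. save_syslog is a hypothetical helper.
save_syslog() {
    local src="${1:-/var/log/syslog}"
    local destdir="${2:-/boot}"
    local dest
    dest="$destdir/syslog-$(date +%Y%m%d-%H%M%S).txt"
    # sync asks the kernel to flush the copy to the device before we report success
    if cp "$src" "$dest" && sync; then
        echo "$dest"
    else
        echo "could not write to $destdir" >&2
        return 1
    fi
}
```

On success the function prints the path of the new copy. A non-zero return means the destination could not be written, which is itself worth noting when reporting the crash.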
Martinxds Posted September 25, 2021
Thanks, from now on when it can't get the diagnostic file I'll grab the syslog. So, good(?) news: it hasn't hard crashed since yesterday morning. But today the shares had disappeared. Attached is the Diagnostic file. The log has this line in it:
Sep 26 07:40:15 Hydra kernel: shfs[2510]: segfault at 0 ip 00000000004043b8 sp 0000148eb975b780 error 4 in shfs[402000+b000]
Might this indicate a cache drive issue? I had to reboot for the shares to come back online.
hydra-diagnostics-20210926-0748.zip
Martinxds Posted September 26, 2021
Unfortunately it crashed again. Not being able to create a diagnostic file also means the cp command doesn't work; it fails with an error:
cp: cannot create regular file '/boot/syslog.txt' : Input/output error
Is this potentially a USB issue? It is a brand new USB drive, but I'm going to back up the config, format it and restore. I did set it up to write the syslog to flash, but it doesn't seem to have anything useful that I can see (attached: syslog).
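An Input/output error on /boot suggests the flash drive itself has stopped accepting writes. A quick way to test that from the console is to try writing a small probe file. The sketch below assumes the flash is mounted at /boot, as in the error above; `flash_is_writable` is a hypothetical helper name.

```shell
#!/bin/bash
# Return 0 if a small file can be created and removed in the given
# directory, 1 otherwise. flash_is_writable is a hypothetical helper.
flash_is_writable() {
    local dir="${1:-/boot}"
    local probe="$dir/.write-test.$$"
    # the brace group keeps the shell's own redirection error quiet
    if { echo probe > "$probe"; } 2>/dev/null && rm -f "$probe"; then
        return 0
    fi
    return 1
}
```

For example, `flash_is_writable || echo "flash not writable"` run right after a crash shows whether the USB device has dropped out, which would also explain why diagnostics cannot be saved.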
JorgeB Posted September 26, 2021
6 hours ago, Martinxds said: "Is this potentially a USB issue?"
Failing to copy the syslog, possibly; make sure you're using a USB 2.0 port. The server crashing appears more hardware related. I would always go with Intel for Unraid, though some are using Ryzen-based servers reliably.
Martinxds Posted September 28, 2021
So I formatted it and swapped it over to a USB 2 port. It was stable for a good day and a half, but sadly it just had a crash now. Diagnostics attached. I can't see anything in there indicating what the issue was.
hydra-diagnostics-20210928-1051.zip
I was able to log on to the console, create the diagnostics and issue a reboot command, though. Might I need to look at replacing other older equipment like the PSU? I can't justify a CPU platform swap, though.
JorgeB Posted September 28, 2021
I don't see any issues logged. One more thing you can try is to boot the server in safe mode with all Docker containers/VMs disabled and let it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem; if it doesn't, start turning on the other services one by one.
Martinxds Posted September 29, 2021
I found I could cause it to crash and reproduce the error last night by watching something on Plex. I then swapped the power supply, but it still crashed on Plex usage. Watching the same file directly via SMB worked fine. I then turned Docker off and let it do a parity check last night, and it had crashed by the time I woke up. Trying again. I'll turn safe mode on once this is done or fails and see how we go. I did set up syslog mirroring a while ago and here it is attached: syslog (3)
My biggest problem, I think, is that I can't get it to write the Diagnostic file when it crashes. The console responds, but it can't write anything to the file system or seem to access the USB drive.
Martinxds Posted September 29, 2021
Running a Memtest tonight.
Martinxds Posted September 30, 2021
Memtest found nothing. Reseated the RAM in different slots anyway. Swapped the USB drive and it still crashed. Ran it today in safe mode without Docker running and sadly got a random crash.
Martinxds Posted September 30, 2021 (edited September 30, 2021)
I deleted the Docker image file in case it was corrupt, but no luck. So I have swapped everything except the CPU, mobo and RAM (and drives). It crashed again just now: I had just started the array with Docker stopped and begun deleting plugins when it went down. I can confirm it also crashes as soon as I try to play anything in Plex. Is there anything else I can test before I look at replacing the mobo and CPU? Is it worth doing a New Config?
Martinxds Posted October 3, 2021 (Solution)
So out of sheer desperation I replaced the CPU and motherboard, and I am looking stable at over 24 hours uptime, running all containers and using Plex extensively. Now this travesty is done, I'll try to enjoy Unraid!
nicus Posted July 18, 2022
Aw jeez, this all sounds exactly like what I'm going through. I had to do a lot of data migration between disks recently, and I've been having loads of crashes. Can I ask what mobo/CPU you had before and after? I have an ASRock X570 Pro4 and a Ryzen 3600X. I've been wanting to upgrade the CPU to either the 5800X3D or the 5900X anyway, but I was hoping to avoid a mobo swap. Also, any lingering stability issues since your last post?
Martinxds Posted July 19, 2022
Blast from the past, this thread. I can report it has been working fine since! I had a 3400G with an ASRock B550 Steel Legend. I replaced them with a 10400 and an MSI B560M-A PRO (which supports GPU encoding for Plex, woo).
New information as well. About two months ago I used the old 3400G and the ASRock B550 Steel Legend for a budget build for a family member, and Windows would crash when a CPU-intensive game (Stellaris) was run. Every other game that didn't tax the CPU was fine. This tells me the motherboard was faulty from the get-go, as the two are meant to be compatible with an updated BIOS. I replaced the board with a B450 of some sort and Windows stopped crashing. Reminds me I should try a warranty claim on that board now that I know it was the problem, across two systems and OSes. Not that I have a spare CPU to put in it any more.
This may not help with your current troubleshooting, as mine turned out to be a hardware fault, so I'd encourage you to open your own thread!