(solved) Unraid becoming unresponsive only during parity checks.



For the past week or so, Unraid has been becoming unresponsive, but only during parity checks.

When I go to the server, it still appears to be powered on.

The only way to get it to come back is to hard reset the server.

 

I have updated the BIOS to the latest version (ASRock B450 Pro4, BIOS v5.0).

I have updated Unraid to 6.9.2.

I have replaced the SAS-to-SATA cables on the LSI 9207-8i and run the test again (failed).

I have connected the drives directly to the motherboard SATA ports (failed).

I have seen mention of C-state issues, so I have changed Power Supply Idle Control to Typical Current Idle.

 

I'll try to be as specific as I can about which components are in the system, in case the diagnostics don't show everything.

 

Rosewill 4U Server Chassis/Server Case/Rackmount Case, Metal Rack Mount Computer Case with 12 Hot Swap Bays & 5 Fans Pre-Installed (RSV-L4412)

 

ASRock B450 Pro4

Ryzen 5 3600

CORSAIR - Vengeance LPX 16GB (4 x 8GB) 3.2 GHz DDR4 DRAM Model:CMK16GX4M2B3200C16

Samsung (MZ-V7E500BW) 970 EVO SSD 500GB - M.2 NVMe Interface Internal Solid State Drive with V-NAND Technology,

2 x 12TB drives

2 x 10TB drives

LSI 9207-8i 6Gbs SAS PCI-E 3.0 HBA IT Mode For ZFS FreeNAS unRAID US

Cooler Master MWE Gold 750 Watt Fully Modular, Compact, Silent Fan 80 PLUS Gold Power Supply, MPY-7501-AFAAG-US

CyberPower CP1000AVRLCD Intelligent LCD UPS System, 1000VA/600W, 9 Outlets, AVR, Mini-Tower

GTX 1060 (6GB) GPU

No VMs

 

Looking for any and all suggestions on where to start. I also included the syslog (not sure if it's needed with the diagnostics); it's the one I have mirrored to a share to try to find the culprit after reboots. From what I can tell from the syslog, the server went down around 6:02 on Aug 13th, and I manually rebooted it around 6:50.

 

Aug 13 06:02:29 Tower kernel: md: recovery thread: P corrected, sector=11929709584
Aug 13 06:50:14 Tower root: Delaying execution of fix common problems scan for 10 minutes
Aug 13 06:50:14 Tower unassigned.devices: Mounting 'Auto Mount' Devices...
Aug 13 06:50:14 Tower emhttpd: /usr/local/emhttp/plugins/user.scripts/backgroundScript.sh "/tmp/user.scripts/tmpScripts/nvidia.patch/script" >/dev/null 2>&1
Aug 13 06:50:14 Tower emhttpd: Starting services...
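
One way to find the point where a mirrored syslog went quiet is to look for unusually long gaps between consecutive timestamps. The following is a minimal Python sketch, not something from the original post: the function name `find_gaps`, the 10-minute threshold, and the assumed year (2021, inferred from the diagnostics file name) are illustrative choices, and the sample lines are copied from the excerpt above. It also assumes classic syslog timestamps with English month abbreviations.

```python
from datetime import datetime, timedelta


def find_gaps(lines, year=2021, threshold=timedelta(minutes=10)):
    """Return (last_seen, next_seen, gap) for consecutive entries far apart in time."""
    gaps = []
    prev = None
    for line in lines:
        # Classic syslog lines start with a 15-character stamp: "Aug 13 06:02:29".
        # The stamp carries no year, so one is supplied (an assumption).
        try:
            ts = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if prev is not None and ts - prev >= threshold:
            gaps.append((prev, ts, ts - prev))
        prev = ts
    return gaps


# Sample lines taken from the excerpt in this post.
SAMPLE = [
    "Aug 13 06:02:29 Tower kernel: md: recovery thread: P corrected, sector=11929709584",
    "Aug 13 06:50:14 Tower root: Delaying execution of fix common problems scan for 10 minutes",
    "Aug 13 06:50:14 Tower emhttpd: Starting services...",
]

for last_seen, next_seen, gap in find_gaps(SAMPLE):
    print(f"silent from {last_seen:%b %d %H:%M:%S} to {next_seen:%b %d %H:%M:%S} ({gap})")
```

On the three sample lines this reports a single silent window of 0:47:45 between 06:02:29 and 06:50:14, which matches the crash and manual reboot described above. To run it on a full mirrored log, pass the lines of the opened file instead of `SAMPLE`.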

 

I have not run memtest yet. I'm not sure whether memory could be the issue if it's only happening during parity checks.

tower-diagnostics-20210813-1533.zip syslog-192.168.0.229(1).log

Edited by XceRpt
2 minutes ago, XceRpt said:

included the syslog (not sure if needed with diag), but it's the one I have mirrored to a share to try to find the culprit after reboots.

The syslog in Diagnostics is the one held in RAM, so it contains nothing from before the reboot. This is exactly the situation where we need the syslog from Syslog Server attached separately from Diagnostics.

 

9 minutes ago, XceRpt said:

C-state issues, so I have changed Power Supply Idle Control to Typical Current Idle.

More here:

 

 

33 minutes ago, trurl said:

The syslog in Diagnostics is the one held in RAM, so it contains nothing from before the reboot. This is exactly the situation where we need the syslog from Syslog Server attached separately from Diagnostics.

 

More here:

 

 

I'm a bit confused, as the one I provided was from a share on the system that's crashing. That's the log you would want in this case, correct?

Taken from the troubleshooting page > Persistent Logs (Syslog server):

 

Logging to file local to Unraid server: Using a bit of trickery, we can use the Unraid server with the problem as the local syslog server.

image.png.d4571c619655dd3f1d516d4af2ed9967.png

 

 

Edited by XceRpt
3 hours ago, XceRpt said:

I'm a bit confused

My point was that you had correctly given us a syslog from before the crash, in addition to the syslog already in Diagnostics, which only covers the time since the reboot.

 

Many people don't know about Syslog Server or what is in Diagnostics. So they give us Diagnostics, plus also another attachment that is simply the syslog already in the Diagnostics they attached.

 

Another thing they often attach is SMART report from one or more disks, but Diagnostics already includes SMART report for all attached disks.

 

Diagnostics also includes a lot of information about your hardware and configuration. Take a look at it; all but one file in the zip is plaintext. Some of the files are just copies of configuration files from settings you make in the webUI.

 

 

Link to comment

Through PuTTY I've tried tailing the log, but I'm still not seeing what's causing the crash. What's weird is that right now the parity check seems to be stuck, yet most of the webUI tabs are accessible, though nothing on them works.

 

 

image.thumb.png.294c7472ec44190c5b02c870e4a5e9a8.png

Edited by XceRpt
  • XceRpt changed the title to (solved) Unraid becoming unresponsive only during parity checks.

Update: I can't say 100% why it's not crashing anymore. I tinkered with BIOS settings, etc. At first the BIOS was showing the RAM at 3200. I took a couple of sticks out and booted back up, and it crashed even faster. I put all four back in, and then the BIOS showed the RAM running at 2133, which was its true non-overclocked speed. It's been fine ever since, so I'll go out on a limb and say that was more likely the problem than not.
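
For anyone wanting to check the same thing from the console rather than the BIOS screen, here is a rough Python sketch that compares the rated and configured speed of each DIMM in the text printed by `dmidecode --type memory`. This is an assumption-laden illustration, not what was done in this thread: the function name `memory_speeds` is made up, the sample text is hypothetical (it is not output from this server), and the field names `Speed` and `Configured Memory Speed` (older releases print `Configured Clock Speed`) are the ones recent dmidecode versions use as far as I know. What a board reports in the `Speed` field can vary.

```python
import re


def memory_speeds(dmidecode_text):
    """Return (rated, configured) speed pairs in MT/s, one per populated DIMM."""
    pairs = []
    # dmidecode prints one "Memory Device" block per DIMM slot.
    for block in dmidecode_text.split("Memory Device")[1:]:
        rated = re.search(r"^\s*Speed:\s*(\d+)", block, re.MULTILINE)
        configured = re.search(
            r"^\s*Configured (?:Memory|Clock) Speed:\s*(\d+)", block, re.MULTILINE
        )
        # Empty slots report "Unknown" instead of a number and are skipped.
        if rated and configured:
            pairs.append((int(rated.group(1)), int(configured.group(1))))
    return pairs


# Hypothetical sample text for illustration only, not taken from this server.
SAMPLE = """\
Memory Device
\tSize: 8 GB
\tSpeed: 3200 MT/s
\tConfigured Memory Speed: 2133 MT/s
Memory Device
\tSize: No Module Installed
\tSpeed: Unknown
\tConfigured Memory Speed: Unknown
"""

for rated, configured in memory_speeds(SAMPLE):
    if rated == configured:
        print(f"DIMM running at its reported speed of {configured} MT/s")
    else:
        print(f"DIMM reports {rated} MT/s but is configured for {configured} MT/s")
```

To use it on a real system, capture the command's output (it normally needs root) and pass that text to `memory_speeds` in place of `SAMPLE`.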
