
Unraid Dies, Can't Reach It, Causes Network Outage



This has happened several times now in the past few weeks and I can't figure out a commonality in all the shutdowns so I've come here for help. 

 

My server will become unresponsive randomly. When it does this, it doesn't shut the PC down, but I can no longer reach it on my network. Somehow it also kills my network, and I can't get anything back online until I unplug the Ethernet cable either at the tower or at the switch. I can plug it back in after that and everything is fine on my network, but I still can't reach my tower, and the only way to get it back is a hard reboot. I tried plugging a monitor in over HDMI when it's in this state, but I haven't been able to get a picture out of the tower past the splash screen ever since I set it up, so I didn't think much of it.

 

I tried changing some configs on it, and I installed the Fix Common Problems plugin and implemented all of its suggestions. It's a very recent build with all new components that I tested thoroughly before assembling. I am at a loss, so I set the log to be copied to flash, and when it crashed most recently I grabbed the logs. I could use some help deciphering what useful info there may be. I don't see a way to attach files to this post, so if someone can let me know how to do that, or if I need to just copy it to a pastebin, I'd appreciate it.

Link to comment

I'm not aware of any overclocked RAM. I made no changes to overclock it out of the box. It's just DDR5-5600 RAM with a Ryzen 7950X, so it might seem like the 5600 MHz is overclocked. Does it say somewhere in the diags that it's overclocked?

2 hours ago, JorgeB said:

Btrfs is detecting data corruption on both pool devices

 

Where are you seeing that?

Link to comment
11 hours ago, MyNameWasTaken said:

I'm not aware of any overclocked RAM

I forgot that the 7000 series is DDR5. According to the diags the RAM is running at 3600 MT/s, so well within spec. I don't recall if the maximum officially supported speed is 5200 or 5600 for these.

 

11 hours ago, MyNameWasTaken said:

Where are you seeing that?

 

Jan 22 17:24:38 tower kernel: BTRFS info (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 47, gen 0
Jan 22 17:24:38 tower kernel: BTRFS info (device nvme0n1p1): bdev /dev/nvme1n1p1 errs: wr 0, rd 0, flush 0, corrupt 29, gen 0

During mount you see any corruption previously detected for this filesystem.
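
If you want to pick these lines out of a saved syslog yourself, here is a minimal Python sketch that pulls the per-device counters out of lines in the format shown above. The function names (`parse_btrfs_errs`, `devices_with_errors`) are made up for illustration, and the regex only assumes the exact line layout from the two log lines quoted in this post, so treat it as a starting point rather than a complete parser.

```python
import re

# Matches the per-device error summary btrfs prints at mount time,
# in the layout seen in the two log lines quoted above.
BTRFS_ERRS = re.compile(
    r"BTRFS \w+ \(device (?P<fs>[^)]+)\): bdev (?P<dev>\S+) errs: "
    r"wr (?P<wr>\d+), rd (?P<rd>\d+), flush (?P<flush>\d+), "
    r"corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)

COUNTERS = ("wr", "rd", "flush", "corrupt", "gen")


def parse_btrfs_errs(lines):
    """Return {device: {counter: value}} for every matching syslog line."""
    stats = {}
    for line in lines:
        match = BTRFS_ERRS.search(line)
        if match:
            stats[match["dev"]] = {name: int(match[name]) for name in COUNTERS}
    return stats


def devices_with_errors(stats):
    """Return the devices that have at least one non-zero counter."""
    return sorted(dev for dev, counts in stats.items() if any(counts.values()))


# The two lines from the diagnostics quoted in this thread.
sample = [
    "Jan 22 17:24:38 tower kernel: BTRFS info (device nvme0n1p1): "
    "bdev /dev/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 47, gen 0",
    "Jan 22 17:24:38 tower kernel: BTRFS info (device nvme0n1p1): "
    "bdev /dev/nvme1n1p1 errs: wr 0, rd 0, flush 0, corrupt 29, gen 0",
]

stats = parse_btrfs_errs(sample)
for dev in devices_with_errors(stats):
    print(dev, stats[dev])
```

To run it against a real log, pass an open file instead of `sample`, for example `parse_btrfs_errs(open("syslog.txt"))`, where `syslog.txt` stands in for wherever you saved the log.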

Link to comment
2 hours ago, JorgeB said:

 

Jan 22 17:24:38 tower kernel: BTRFS info (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 47, gen 0
Jan 22 17:24:38 tower kernel: BTRFS info (device nvme0n1p1): bdev /dev/nvme1n1p1 errs: wr 0, rd 0, flush 0, corrupt 29, gen 0

During mount you see any corruption previously detected for this filesystem.

Okay thank you for pointing that out so I can maybe recognize it in the future. What do you suggest? Run memtest again to verify the ram and if that passes flip that BIOS setting or adjust c states like in your linked comment? I ran data scrubbing and didn’t find any errors on the cache. Are those drives maybe the issue and need to be replaced? I just got them so they’d definitely be under warranty. 

Link to comment
