Frustration Has Set In - Help Please

Good Afternoon – I’ve reached the point where I’ve just decided to ask for help or guidance here.

 

I’ve had an Unraid server for approximately 3 years with little to no issues over that time. At least not anything I couldn’t figure out myself.

 

Approximately 10-12 months ago, my server just started randomly not responding. Essentially just locking up. I would have to power cycle to reboot and bring it back online. Of course, then a Parity check would ensue upon coming back online. The server would never last more than 24-30 hours before locking up again.

 

As I said, I suffered through this for the last year until about 3 weeks ago, when I just couldn’t take it any longer.

 

I decided to completely install Unraid from scratch.  I pulled my key off the flash drive, pulled all the data drives out. I took a deep breath and just dove right in.

 

First thing I wanted to do was a MemTest, so I ran 4 passes on the RAM without any failures. BIOS was confirmed up to date. I then started building the array. I began with 2 Parity drives and then would add my data drives one at a time.

 

I did not want to keep the configuration of the old array, and decided that for me it would be just as easy to mount an Unassigned drive, copy its data to the array, then add that Unassigned drive to the array. Essentially, shuffle my data very slowly back in. Is this efficient? Probably not. Should it work without a problem? Yes.

 

But we come full circle back to the Unraid server just not staying “Active” and becoming unresponsive. While copying data or trying to pre-clear a drive, the server just becomes unreachable. The GUI seems to be active at times, but if you try to make any changes or stop the array, it just spins.

 

Attached to the post is my SysLog covering a reboot through to the lock-up. In addition, I’ve captured a picture of the server monitor showing the last portion, where it flashes up some errors.

 

I’ve done some basic research. I have an AMD Ryzen CPU, and I have tried turning C-States both on and off; the same results occur no matter the setting.

 

I know this isn’t an Unraid issue per se, and it most certainly is on my end, but I’m at a point where I just want to build a Windows box with JBODs via USB.

 

If anyone has any insight or suggestions, I would be indebted to you.

 

Thank you

-Keelhaulers

****************************

UNRAID: Version: 6.10.3 

 

CPU: AMD Ryzen 7 3700X 8-Core @ 3600 MHz

 

Motherboard: Gigabyte Technology Co., Ltd. X570 AORUS MASTER

BIOS: American Megatrends International, LLC., Version F36e, BIOS dated: Thursday, May 12, 2022

 

RAM: 64GB DDR4

GPU: NVIDIA GeForce GTX 1050 Ti

Capture.JPG

syslog-192.168.1.18.log

IMG_1754.jpg

Edited by Keelhaulers
Added Unraid version information. Picture fix

  • Author

Just happened again after being up for 1 hour. Figured I'd add the latest syslog, just in case it shows something different.

 

-Keelhaulers

 

 

syslog-192.168.1.18 (Take 2).log

It looks like a disk controller issue:

Aug 11 13:38:19 THOR kernel: mpt3sas 0000:05:00.0: invalid VPD tag 0x00 (size 0) at offset 0; assume missing optional EEPROM
Aug 11 13:38:19 THOR kernel: r8169 0000:08:00.0: invalid VPD tag 0x00 (size 0) at offset 0; assume missing optional EEPROM
Aug 11 13:38:21 THOR unassigned.devices: Mounting 'Auto Mount' Remote Shares...
Aug 11 13:38:30 THOR kernel: mdcmd (37): nocheck cancel
Aug 11 13:38:30 THOR kernel: md: recovery thread: exit status: -4
Aug 11 13:43:14 THOR kernel: smartctl[1871]: segfault at 146a1f38 ip 0000152cab3776ea sp 00007fffd8de7940 error 6 in libc-2.33.so[152cab2b0000+15e000]
Aug 11 13:43:14 THOR kernel: Code: 00 00 00 83 f8 0f 0f 84 11 18 00 00 48 8b 4b 70 8d 50 01 48 8b 7c 24 08 48 c1 e0 06 89 93 80 00 00 00 66 0f ef c0 48 8d 14 01 <0f> 11 44 01 08 48 8d 74 01 08 48 c7 42 18 00 00 00 00 f3 0f 6f 3f
Aug 11 13:45:46 THOR kernel: smartctl[12254]: segfault at 38 ip 00001475db5e6a1c sp 00007ffdaa770790 error 4 in libc-2.33.so[1475db51c000+15e000]
Aug 11 13:45:46 THOR kernel: Code: 83 c7 01 41 83 ff 40 75 d5 83 c3 40 49 83 c4 08 81 fb 00 01 00 00 75 c1 4c 8b 65 00 e9 8d f6 ff ff 0f 1f 44 00 00 49 8b 56 20 <8b> 72 38 48 8b 55 18 89 34 82 4d 8b 6e 08 4d 85 ed 74 14 4d 89 ee

smartctl is segfaulting.  We will need a disk guru like @JorgeB to take a look.
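
For anyone who wants to check their own syslog for the same pattern, here is a minimal Python sketch that pulls kernel segfault reports out of a list of log lines and counts them per process. The function names (`find_segfaults`, `count_by_process`) are made up for illustration, and the regex is only an assumption based on the two segfault lines quoted above; other kernels may format these messages differently.

```python
import re
from collections import Counter

# Pattern modeled on the segfault lines quoted in this thread, e.g.
#   "... kernel: smartctl[1871]: segfault at 146a1f38 ip ... sp ... error 6 in libc-2.33.so[...]"
# This is an assumption about the format, not an official kernel log grammar.
SEGFAULT_RE = re.compile(
    r"kernel: (?P<proc>[^\s\[]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>\d+)"
    r"(?: in (?P<module>[^\s\[]+))?"
)


def find_segfaults(lines):
    """Return one dict per log line that looks like a kernel segfault report."""
    hits = []
    for line in lines:
        m = SEGFAULT_RE.search(line)
        if m:
            hits.append({
                "process": m["proc"],
                "pid": int(m["pid"]),
                "error": int(m["err"]),
                "module": m["module"],  # None if the line has no " in <module>" part
            })
    return hits


def count_by_process(hits):
    """Count segfault reports per process name."""
    return dict(Counter(hit["process"] for hit in hits))


# Sample lines copied from the syslog excerpt above (one of them is not a segfault).
SAMPLE_LINES = [
    "Aug 11 13:38:30 THOR kernel: md: recovery thread: exit status: -4",
    "Aug 11 13:43:14 THOR kernel: smartctl[1871]: segfault at 146a1f38 ip 0000152cab3776ea "
    "sp 00007fffd8de7940 error 6 in libc-2.33.so[152cab2b0000+15e000]",
    "Aug 11 13:45:46 THOR kernel: smartctl[12254]: segfault at 38 ip 00001475db5e6a1c "
    "sp 00007ffdaa770790 error 4 in libc-2.33.so[1475db51c000+15e000]",
]

for hit in find_segfaults(SAMPLE_LINES):
    print(hit)
print(count_by_process(find_segfaults(SAMPLE_LINES)))
```

To run it against a real file, pass the open file object instead of `SAMPLE_LINES`, since `find_segfaults` accepts any iterable of lines. Segfaults spread across several unrelated programs would point more toward hardware than toward one faulty tool, but that is a rule of thumb rather than a diagnosis.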

 

You also have a Realtek NIC, and that may be causing some issues.  Realtek drivers on Linux are troublesome because they are not updated for each release of Linux.

  • Community Expert

The Unraid driver is crashing. This can sometimes be helped by using a different kernel, so update to v6.11.0-rc3 to see if it helps.

  • Author

thank you, will give it a try this morning and see what happens.

  • Author

Well, I upgraded to 6.11.0-rc3, set out to add my next disk to the array, and began the clear on it.  It only lasted about 1 or 2 hours before the errors and unresponsiveness kicked in again.  Attached are my latest diagnostics.

 

Any further suggestions would be appreciated.

 

-Keelhaulers

thor-diagnostics-20220812-0952.zip

  • Community Expert

Smartctl segfaulting is strange and could be a hardware issue. Do you remember if the issues started after an Unraid release upgrade? If there's a known working release, downgrade back to it and boot it in safe mode; if the issues continue, it's likely hardware.

  • Author

Thanks, I was thinking of trying to downgrade.  It's been flaky for a while.  I will try to pinpoint a date and release that I remember working "properly" and then try to install that version.

 

I appreciate all the suggestions.

 

-Keelhaulers

  • Author

thank you - will give anything a shot at this point.

 

I've got it downgraded to 6.9.0-rc1 currently and I'm attempting to clear a disk. Will see what happens from here.

 

-Keelhaulers

Edited by Keelhaulers
add info

  • Author

No go on this either. I keep noticing that the errors mention a time sync issue, so I checked the BIOS, and the time is wrong. I reset it, but after about an hour it changes again.  I'm going to change out the CMOS battery and see if I can at least get different results.  I don't have high hopes though.

Edited by Keelhaulers
typos

  • Community Expert

Time should not change with the server on, even with a bad CMOS battery; the board might be going bad.
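
One rough way to see whether the clock really is jumping while the server is running is to scan a saved syslog for timestamps that go backwards or leap forward. The Python sketch below does that under some assumptions: the log uses the classic "Aug 11 13:38:19" prefix seen earlier in this thread, month names are in English, the year (which that format omits) is supplied by hand, and the names `parse_syslog_time` and `find_clock_jumps` are hypothetical.

```python
from datetime import datetime, timedelta


def parse_syslog_time(line, year=2022):
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a syslog line.

    Classic syslog timestamps carry no year, so one is assumed (2022 here,
    matching the dates in this thread). Returns None if the line does not
    start with a timestamp in that format.
    """
    try:
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def find_clock_jumps(lines, max_gap=timedelta(minutes=30), year=2022):
    """Return (previous, current, delta) for each suspicious timestamp step.

    A step is flagged when time goes backwards or when it moves forward by
    more than max_gap. The 30 minute default is an arbitrary choice.
    """
    jumps = []
    prev = None
    for line in lines:
        ts = parse_syslog_time(line, year)
        if ts is None:
            continue  # skip lines without a recognizable timestamp
        if prev is not None:
            delta = ts - prev
            if delta < timedelta(0) or delta > max_gap:
                jumps.append((prev, ts, delta))
        prev = ts
    return jumps
```

Calling `find_clock_jumps(open("syslog.log"))` on a saved log (the filename is a placeholder) returns a list of flagged steps. Backward steps are the more telling ones; a large forward gap can simply mean the log was quiet for a while or the server was rebooted, so treat those as hints only.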

  • Author

Yep, as you suspected. The new CMOS battery had no effect. Still getting a kernel panic.

 

I've wasted enough time with this and trying to fix things. I am just going to pull the trigger on a new MB, CPU, and memory. 

 

Gonna stay away from AMD Ryzen this time and look for a mid-range Intel CPU and board.

 

Would like something with at least 3 M.2 slots, the search is on.

 

Any suggestions would be greatly appreciated.  Just hosting Plex, I don't do any gaming...

 

Thanks again to everyone who chimed in.

 

-Keelhaulers
