Unraid in semi-hung state

November 14, 2025

I'm able to SSH on to the server and the web UI works, but everything else seems to time out. I can't access files etc. across the network, and if I try and trigger any commands via the GUI they just time out as well. Reboot, same. Stop array, same.

I couldn't see anything obvious in /var/log/syslog that indicated any issues. Running top doesn't show anything with high mem or CPU usage, but some of the cores keep spiking/staying at 100% utilisation.

Are there any logs I can provide to work out what might be going on here?

Is there a way I can force the array offline in a 'safe' manner and then trigger a reboot perhaps? Any other thoughts as to how I should approach this?

root@NAS01:~# tail /var/log/syslog

Nov 14 16:06:46 NAS01 emhttpd: Sync filesystems...

Nov 14 16:06:46 NAS01 emhttpd: shcmd (214519): sync

Nov 14 16:06:47 NAS01 sshd-session[2206811]: Postponed keyboard-interactive/pam for root from 192.168.0.132 port 58796 ssh2 [preauth]

Nov 14 16:06:47 NAS01 sshd-session[2206811]: Accepted keyboard-interactive/pam for root from 192.168.0.132 port 58796 ssh2

Nov 14 16:06:47 NAS01 sshd-session[2206811]: pam_unix(sshd:session): session opened for user root(uid=0) by (uid=0)

Nov 14 16:06:47 NAS01 elogind-daemon[1785]: New session 2 of user root.

Nov 14 16:06:47 NAS01 sshd-session[2206811]: User child is on pid 2207105

Nov 14 16:06:47 NAS01 sshd-session[2207105]: Starting session: shell on pts/0 for root from 192.168.0.132 port 58796 id 0

Nov 14 16:09:28 NAS01 nginx: 2025/11/14 16:09:28 [error] 6488#6488: *17943 upstream timed out (110: Connection timed out) while reading upstream, client: 192.168.0.132, server: , request: "POST /update.htm HTTP/2.0", upstream: "http://unix:/var/run/emhttpd.socket:/update.htm", host: "nas01.local", referrer: "https://nas01.local/Main"

Nov 14 16:14:28 NAS01 nginx: 2025/11/14 16:14:28 [error] 6488#6488: *17943 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.0.132, server: , request: "POST /update.htm HTTP/2.0", upstream: "http://unix:/var/run/emhttpd.socket/update.htm", host: "nas01.local", referrer: "https://nas01.local/Settings/DiskSettings"

Edited November 14, 2025 by flashback

November 14, 2025

Community Expert

Try to get the complete syslog:

cp /var/log/syslog /boot/syslog.txt

November 14, 2025

Author

5 minutes ago, JorgeB said:
Try to get the complete syslog:
cp /var/log/syslog /boot/syslog.txt

That's done. Where's the best place to upload it? It's 1.3 MB, so PasteBin fails (512 KB limit).

Edited November 14, 2025 by flashback

November 14, 2025

Community Expert

You should be able to attach it to the forum; zip it first if it's too large.

November 14, 2025

Author

Attached.

NAS01syslog.zip

November 14, 2025

Community Expert

Nov 14 15:13:44 NAS01 kernel: <TASK>
Nov 14 15:13:44 NAS01 kernel: unraidd+0xe00/0x13d0 [md_mod]
Nov 14 15:13:44 NAS01 kernel: ? preempt_latency_start+0x2b/0x50
Nov 14 15:13:44 NAS01 kernel: ? md_thread+0xf1/0x120 [md_mod]
Nov 14 15:13:44 NAS01 kernel: ? kthread_should_park+0x12/0x30
Nov 14 15:13:44 NAS01 kernel: md_thread+0xf1/0x120 [md_mod]
Nov 14 15:13:44 NAS01 kernel: ? __pfx_autoremove_wake_function+0x10/0x10
Nov 14 15:13:44 NAS01 kernel: ? __pfx_md_thread+0x10/0x10 [md_mod]

Unraid driver crashed. This is almost always a hardware problem. Reboot the server; you likely will need to force it, then post the diagnostics to see the hardware used.
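For anyone wanting to find this kind of trace in their own saved syslog, a minimal sketch follows. `md_trace_lines` is a hypothetical helper name, and it assumes the trace lines carry the `[md_mod]` module tag in the same form as the excerpt above.

```shell
# md_trace_lines FILE
# Hypothetical helper: print, with line numbers, every kernel log line in FILE
# that mentions the md_mod module named in the trace.
md_trace_lines() {
    # -F treats the pattern as a fixed string, so the brackets are literal
    grep -n -F '[md_mod]' "$1" | grep -F ' kernel: '
}
```

With the copy made earlier in the thread, this could be run as `md_trace_lines /boot/syslog.txt`.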

November 14, 2025

Author

10 minutes ago, JorgeB said:

Nov 14 15:13:44 NAS01 kernel: <TASK>
Nov 14 15:13:44 NAS01 kernel: unraidd+0xe00/0x13d0 [md_mod]
Nov 14 15:13:44 NAS01 kernel: ? preempt_latency_start+0x2b/0x50
Nov 14 15:13:44 NAS01 kernel: ? md_thread+0xf1/0x120 [md_mod]
Nov 14 15:13:44 NAS01 kernel: ? kthread_should_park+0x12/0x30
Nov 14 15:13:44 NAS01 kernel: md_thread+0xf1/0x120 [md_mod]
Nov 14 15:13:44 NAS01 kernel: ? __pfx_autoremove_wake_function+0x10/0x10
Nov 14 15:13:44 NAS01 kernel: ? __pfx_md_thread+0x10/0x10 [md_mod]

Unraid driver crashed. This is almost always a hardware problem. Reboot the server; you likely will need to force it, then post the diagnostics to see the hardware used.

Thanks, I suspected as much as I saw some kernel messages in there. I'm trying to do a pre-reboot dump of diagnostics, it's been running a few minutes now - I assume it shouldn't take that long and therefore may not be able to be grabbed pre-reboot then? I've attached the ZIP it's generated so far, there also seems to be one generated earlier today which I've also attached.

For the forced reboot, should I just run shutdown -r from the terminal, or physically reset it using the NAS power button?

fwlonnas01-diagnostics-20251020-1252.zip fwlonnas01-diagnostics-20251020-2004.zip

November 15, 2025

Community Expert

12 hours ago, flashback said:
I assume it shouldn't take that long and therefore may not be able to be grabbed pre-reboot then?

Diags can be taken after a reboot; they're just to see the hardware. None of the hardware is known to have issues AFAIK. There's only one stick of RAM, so you cannot test with one at a time. Try running memtest, but note that the result is only definitive if it finds errors.

November 15, 2025

Author

Just now, JorgeB said:
Diags can be taken after a reboot; they're just to see the hardware. None of the hardware is known to have issues AFAIK. There's only one stick of RAM, so you cannot test with one at a time. Try running memtest, but note that the result is only definitive if it finds errors.

OK, should I just run "/sbin/reboot" from the command line or "powerdown -r"? Searching the forum gave multiple viewpoints. Note that the array is still 'stuck':

NAS01:~# grep -E '^(mdState|fsState)=' /var/local/emhttp/var.ini

mdState="STARTED"

fsState="Stopping"
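The same check can be wrapped in a small function if it needs repeating while the array is stuck. This is only a sketch: `array_state` is a hypothetical name, and it assumes the `key="value"` layout shown in the output above.

```shell
# array_state [FILE]
# Hypothetical helper: print mdState and fsState, without quotes, from an
# emhttp var.ini-style file. Defaults to the path used in the grep above.
array_state() {
    file=${1:-/var/local/emhttp/var.ini}
    # -n suppresses normal output; the trailing p prints only the matched keys
    sed -n -E 's/^(mdState|fsState)="?([^"]*)"?$/\1=\2/p' "$file"
}
```

Calling `array_state` with no argument reads the live file; passing a path reads a saved copy instead.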

Edited November 15, 2025 by flashback

November 15, 2025

Author

I ran powerdown, which reported that it is deprecated but still started the process (as my earlier attempt from the GUI had). It behaves the same way: it says it is rebooting but never actually reboots. Looks like I'll need to physically reboot via the power button. Note: I checked /mnt/disk1 and /mnt/disk2, and both listed their data OK with ls -lah.

:/mnt# powerdown -r

powerdown: /usr/local/sbin/powerdown has been deprecated

Broadcast message from root@NAS01 (pts/0) (Sat Nov 15 07:50:38 2025):

The system is going down for reboot NOW!

November 15, 2025

Community Expert

You will likely need to force a reboot.

November 15, 2025

Author

2 minutes ago, JorgeB said:
You will likely need to force a reboot.

Rebooted, syslog and diagnostics attached.

fwlonnas01-diagnostics-20251115-0759.zip syslog20251115.zip

November 15, 2025

Author

I can change settings via the GUI now, so that's working after the reboot. I ran the syslog through ChatGPT as well as skimming it myself, which produced the following:

Category	Severity	Notes
ACPI BIOS Errors	Medium	Fix with BIOS update if available
Memory Controller (EDAC) Error	Medium–High	Run memory test; monitor ECC events
NVMe Missing SUBNQN	Low	Common, usually harmless
PCI Resource Mapping Warning	Low–Medium	Likely harmless; BIOS update may help

If I start my array now (as I have it set to manual after reboot), I should uncheck write corrections to parity, as a safety measure, right? Or should I not try and start the array yet?

November 15, 2025

Community Expert

Have you already run memtest?

November 15, 2025

Author

12 minutes ago, JorgeB said:
Have you already run memtest?

I can only do that from the boot menu, can't I? If so, it'll need to wait till I'm back in town so I can physically access the machine...

November 16, 2025

Author

I thought I'd do as much as I could before I get back, so I triggered an online memtest of 28G out of the 32G RAM, with the following results:

[16.11.2025 04:34:40 GMT] /usr/bin/memtester 28G 2
=============================================================================================

memtester version 4.6.0 (64-bit)
adapted for use on Unraid installations

Copyright (C) 2001-2020 Charles Cazabon
Copyright (C) 2024 desertwitch (modifications for Unraid)

THIS PROGRAM IS PROVIDED AS IS AND WITHOUT ANY WARRANTIES
It is licensed under GNU General Public License Version 2

pagesize is 4096
pagesizemask is 0xfffffffffffff000
want 28672MB (30064771072 bytes)
got  28672MB (30064771072 bytes), trying mlock ...locked.

Loop 1/2:
  Stuck Address            : Testing...           ok
  Random Value             : Testing...           ok
  Compare XOR              : Testing...           ok
  Compare SUB              : Testing...           ok
  Compare MUL              : Testing...           ok
  Compare DIV              : Testing...           ok
  Compare OR               : Testing...           ok
  Compare AND              : Testing...           ok
  Sequential Increment     : Testing...           ok
  Solid Bits               : Testing...           ok
  Block Sequential         : Testing...           ok
  Checkerboard             : Testing...           ok
  Bit Spread               : Testing...           ok
  Bit Flip                 : Testing...           ok
  Walking Ones             : Testing...           ok
  Walking Zeroes           : Testing...           ok
  8-bit Writes             : Testing...           ok
  16-bit Writes            : Testing...           ok

Loop 2/2:
  Stuck Address            : Testing...           failed
  Random Value             : Testing...           ok
  Compare XOR              : Testing...           ok
  Compare SUB              : Testing...           ok
  Compare MUL              : Testing...           ok
  Compare DIV              : Testing...           ok
  Compare OR               : Testing...           ok
  Compare AND              : Testing...           ok
  Sequential Increment     : Testing...           ok
  Solid Bits               : Testing...           ok
  Block Sequential         : Testing...           ok
  Checkerboard             : Testing...           ok
  Bit Spread               : Testing...           ok
  Bit Flip                 : Testing...           failed
  Walking Ones             : Testing...           ok
  Walking Zeroes           : Testing...           ok
  8-bit Writes             : Testing...           ok
  16-bit Writes            : Testing...           ok

Done.

=============================================================================================
[16.11.2025 07:16:21 GMT] The operation has finished with errors.
[16.11.2025 07:16:21 GMT] Code: 6 - Error during Stuck Address Test + Other Test(s).
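The exit code can be decoded by hand. The sketch below assumes the Unraid-adapted build keeps upstream memtester's convention, where the exit code is a bitwise OR of 1, 2 and 4 as described in the memtester man page; the "Code: 6" message above is consistent with that. `decode_memtester_exit` is a hypothetical helper name.

```shell
# decode_memtester_exit CODE
# Hypothetical helper: print one line per flag set in a memtester exit code.
# Assumes the upstream convention (bitwise OR of 1, 2 and 4).
decode_memtester_exit() {
    code=$1
    [ "$code" -eq 0 ] && { echo "no errors reported"; return 0; }
    [ $((code & 1)) -ne 0 ] && echo "error allocating or locking memory, or invocation error"
    [ $((code & 2)) -ne 0 ] && echo "error during stuck address test"
    [ $((code & 4)) -ne 0 ] && echo "error during one of the other tests"
    return 0
}
```

For the run above, `decode_memtester_exit 6` prints the stuck address line and the other-tests line, matching the Stuck Address and Bit Flip failures in loop 2.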

And:

[16.11.2025 04:34:40 GMT] /usr/bin/memtester 28G 2
=============================================================================================

The detailed error output is enabled, watch this panel for any occurring errors.

FAILURE: possible bad address line at offset 0x000000036eef1328.
FAILURE: 0xffffffffff7fffff != 0xfffffbffff7fffff at offset 0x00000000fb7b35d8.
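To see how the two values in the second FAILURE line differ, they can be XORed. The sketch below is an illustration only: `flipped_bits` is a hypothetical name, and it assumes both values are written as exactly 16 hex digits, as they are here. It works on 32-bit halves so the arithmetic stays within a signed 64-bit range in any POSIX shell.

```shell
# flipped_bits VALUE_A VALUE_B
# Hypothetical helper: print the position (0 = least significant) of every bit
# that differs between two 64-bit values given as 16 hex digits each.
flipped_bits() {
    a=${1#0x}; b=${2#0x}
    # Split each value into its upper and lower 8 hex digits
    a_hi=${a%????????}; a_lo=${a#????????}
    b_hi=${b%????????}; b_lo=${b#????????}
    x_hi=$(( 0x$a_hi ^ 0x$b_hi ))
    x_lo=$(( 0x$a_lo ^ 0x$b_lo ))
    i=0
    while [ "$i" -lt 32 ]; do
        [ $(( (x_lo >> i) & 1 )) -eq 1 ] && echo "$i"
        i=$((i + 1))
    done
    i=0
    while [ "$i" -lt 32 ]; do
        [ $(( (x_hi >> i) & 1 )) -eq 1 ] && echo $((i + 32))
        i=$((i + 1))
    done
    return 0
}
```

Running `flipped_bits 0xffffffffff7fffff 0xfffffbffff7fffff` prints `42`, meaning the two values differ in a single bit. That fits the Bit Flip test failing, though on its own it says nothing about which component is at fault.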

What's interesting is the timing. I installed this a month ago in my comms room downstairs and it had been running fine. On Wednesday I moved it up to my loft, as my sparky came and extended my network up there. It's cooler up there (although at idle the system runs at 18-25 degrees C, and the drives around the same), but the power socket also has a max limit of 5A. Hanging off that socket is the UPS (Eaton 3S850B 3S Gen2 Desktop UPS Uninterruptible Power Supply (510W/850VA)), which powers my 2.5GbE switch and this NAS. Red herring and just coincidental timing?

In the meantime, should I just power off the NAS until I get back and raise a warranty replacement request? There's also the question of the relocation timing: is it possible that the lower temps and/or the power supply to it caused the issue?

November 16, 2025

Community Expert

2 hours ago, flashback said:
In the meantime, should I just power off the NAS

Yes, you should not run the server with known bad RAM.

November 19, 2025

Author

TerraMaster support isn't responding and as Amazon just fulfills for them, they can't provide a straight replacement and will only refund. The price has gone up by £183, which is rather annoying.

I think I may need to look at alternative replacements. In the meantime, what's the best way for me to recover the data from the drives that Unraid was managing? Put them into a USB caddy, and then is there a way to read in Windows or will I need to boot up into a Linux live environment or similar due to XFS?

November 19, 2025

Community Expert

You can just boot with an Unraid trial flash drive in any PC to access them; another Linux distro will also work, such as an Ubuntu live flash drive.

November 19, 2025

Author

9 minutes ago, JorgeB said:
You can just boot with an Unraid trial flash drive in any PC to access them; another Linux distro will also work, such as an Ubuntu live flash drive.

I might just have to boot it back up and try to grab the data that wasn't copied (only 50-100 GB hadn't been backed up by the time it started failing), as I don't have another PC to fire up, only laptops.
