-
Unraid in semi-hung state
I might just have to boot it back up and try to grab the data that wasn't copied (only 50-100 GB hadn't backed up by the time it started failing), as I don't have another PC to fire up - only laptops.
-
Unraid in semi-hung state
TerraMaster support isn't responding and as Amazon just fulfills for them, they can't provide a straight replacement and will only refund. The price has gone up by £183, which is rather annoying. I think I may need to look at alternative replacements. In the meantime, what's the best way for me to recover the data from the drives that Unraid was managing? Put them into a USB caddy, and then is there a way to read in Windows or will I need to boot up into a Linux live environment or similar due to XFS?
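On the XFS question: Windows has no native XFS support, so a Linux live environment is the usual route. Below is a minimal sketch of a read-only mount, assuming the disks are unencrypted XFS. The device name /dev/sdX1 and the mount point /mnt/recover are placeholders, not values from this system. The function only prints the commands so they can be checked against the `lsblk -f` output before anything is run as root.

```shell
#!/bin/sh
# Print (do not run) the commands for a read-only XFS mount.
# Both arguments are placeholders: confirm the real device with lsblk first.
print_xfs_ro_mount() {
    device="$1"
    mountpoint="$2"
    echo "lsblk -f"
    echo "mkdir -p $mountpoint"
    # ro,norecovery: mount read-only and skip journal replay, so nothing is written.
    echo "mount -t xfs -o ro,norecovery $device $mountpoint"
}

print_xfs_ro_mount "${1:-/dev/sdX1}" "${2:-/mnt/recover}"
```

Mounting each data disk this way one at a time keeps the originals untouched while files are copied off.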
-
[PLUGIN] Live Memory Tester for UNRAID
Thanks all - it's shut down, so I'll deal with the joy that is Amazon for a replacement of the unit.
-
[PLUGIN] Live Memory Tester for UNRAID
I had a recent kernel hang and, as I'm not physically able to get to the device for a few days, stumbled across this as a way of doing an initial investigation. Great tool, thanks!

[16.11.2025 04:34:40 GMT] /usr/bin/memtester 28G 2
=============================================================================================
The detailed error output is enabled, watch this panel for any occurring errors.
FAILURE: possible bad address line at offset 0x000000036eef1328.
FAILURE: 0xffffffffff7fffff != 0xfffffbffff7fffff at offset 0x00000000fb7b35d8.

What is interesting is the timing of this. I installed this new NAS a month ago in my comms room downstairs and it's been running fine. On Wednesday I moved it up to my loft, as my sparky came and extended my network up there. It's cooler up there (although at idle the system runs at 18-25 degrees C and the drives around the same), but the power socket also has a max limit of 5A. Hanging off the power socket is the UPS (Eaton 3S850B 3S Gen2 Desktop UPS Uninterruptible Power Supply (510W/850VA)), which powers my 2.5GbE switch and this NAS.

Is this a red herring and just coincidental timing? In the meantime, should I just power off the NAS till I get back and raise a warranty replacement request?

Then there's the question of the relocation timing: is it possible that the lower temps and/or the power running to it caused the issue? If so, a warranty replacement may not help, and the same issue might recur.
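As a rough sanity check on the 5A socket, the arithmetic can be sketched as below. It assumes a nominal 230 V mains supply, which is not stated in the post, and it ignores battery charging overhead and inrush current, so treat it as an estimate only.

```shell
#!/bin/sh
# Headroom in VA between a socket's rating and a load's rated apparent power.
# Arguments: socket amps, mains volts (assumed), load VA.
socket_headroom_va() {
    echo $(( $1 * $2 - $3 ))
}

# 5 A socket, assumed 230 V, UPS rated 850 VA (from the model name in the post).
socket_headroom_va 5 230 850   # prints 300
```

Under that assumption the socket is rated for about 1150 VA, so the UPS at its full 850 VA rating would still sit below the limit.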
-
Unraid in semi-hung state
I thought I might do as much as I could before I get back, and triggered an online memtest of 28G out of the 32G RAM. Results as follows:

[16.11.2025 04:34:40 GMT] /usr/bin/memtester 28G 2
=============================================================================================
memtester version 4.6.0 (64-bit) adapted for use on Unraid installations
Copyright (C) 2001-2020 Charles Cazabon
Copyright (C) 2024 desertwitch (modifications for Unraid)
THIS PROGRAM IS PROVIDED AS IS AND WITHOUT ANY WARRANTIES
It is licensed under GNU General Public License Version 2
pagesize is 4096
pagesizemask is 0xfffffffffffff000
want 28672MB (30064771072 bytes)
got 28672MB (30064771072 bytes), trying mlock ...locked.
Loop 1/2:
  Stuck Address : Testing... ok
  Random Value : Testing... ok
  Compare XOR : Testing... ok
  Compare SUB : Testing... ok
  Compare MUL : Testing... ok
  Compare DIV : Testing... ok
  Compare OR : Testing... ok
  Compare AND : Testing... ok
  Sequential Increment : Testing... ok
  Solid Bits : Testing... ok
  Block Sequential : Testing... ok
  Checkerboard : Testing... ok
  Bit Spread : Testing... ok
  Bit Flip : Testing... ok
  Walking Ones : Testing... ok
  Walking Zeroes : Testing... ok
  8-bit Writes : Testing... ok
  16-bit Writes : Testing... ok
Loop 2/2:
  Stuck Address : Testing... failed
  Random Value : Testing... ok
  Compare XOR : Testing... ok
  Compare SUB : Testing... ok
  Compare MUL : Testing... ok
  Compare DIV : Testing... ok
  Compare OR : Testing... ok
  Compare AND : Testing... ok
  Sequential Increment : Testing... ok
  Solid Bits : Testing... ok
  Block Sequential : Testing... ok
  Checkerboard : Testing... ok
  Bit Spread : Testing... ok
  Bit Flip : Testing... failed
  Walking Ones : Testing... ok
  Walking Zeroes : Testing... ok
  8-bit Writes : Testing... ok
  16-bit Writes : Testing... ok
Done.
=============================================================================================
[16.11.2025 07:16:21 GMT] The operation has finished with errors.
[16.11.2025 07:16:21 GMT] Code: 6 - Error during Stuck Address Test + Other Test(s).

And:

[16.11.2025 04:34:40 GMT] /usr/bin/memtester 28G 2
=============================================================================================
The detailed error output is enabled, watch this panel for any occurring errors.
FAILURE: possible bad address line at offset 0x000000036eef1328.
FAILURE: 0xffffffffff7fffff != 0xfffffbffff7fffff at offset 0x00000000fb7b35d8.

What is interesting is the timing of this. I installed this a month ago in my comms room downstairs and it's been running fine. On Wednesday I moved it up to my loft, as my sparky came and extended my network up there. It's cooler up there (although at idle the system runs at 18-25 degrees C and the drives around the same), but the power socket also has a max limit of 5A. Hanging off the power socket is the UPS (Eaton 3S850B 3S Gen2 Desktop UPS Uninterruptible Power Supply (510W/850VA)), which powers my 2.5GbE switch and this NAS.

Red herring and just coincidental timing? In the meantime, do you think I should just power off the NAS till I get back and raise a warranty replacement request?

Then there's the question of the relocation timing: is it possible that the lower temps and/or the power running to it caused the issue?
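The two values in the second FAILURE line can be compared bit by bit to see how they differ. The sketch below XORs them and lists the positions of the differing bits. The helper name `flipped_bits` is made up for illustration, and it assumes both inputs are exactly 16 hex digits, as in the memtester output above. Each value is split into 32-bit halves so the arithmetic stays within what a plain POSIX shell handles.

```shell
#!/bin/sh
# List the bit positions (0 = least significant) where two 64-bit hex values differ.
# Assumes both arguments are exactly 16 hex digits, optionally prefixed with 0x.
flipped_bits() {
    a=${1#0x}
    b=${2#0x}
    # Split into high and low 32-bit halves to avoid signed 64-bit overflow.
    hi=$(( 0x${a%????????} ^ 0x${b%????????} ))
    lo=$(( 0x${a#????????} ^ 0x${b#????????} ))
    bits=""
    i=0
    while [ "$i" -lt 32 ]; do
        if [ $(( (lo >> i) & 1 )) -eq 1 ]; then bits="$bits $i"; fi
        i=$((i + 1))
    done
    i=0
    while [ "$i" -lt 32 ]; do
        if [ $(( (hi >> i) & 1 )) -eq 1 ]; then bits="$bits $((i + 32))"; fi
        i=$((i + 1))
    done
    echo "${bits# }"
}

flipped_bits 0xffffffffff7fffff 0xfffffbffff7fffff   # prints 42
```

The two values differ in a single bit (bit 42), which is the pattern of a one-bit memory error rather than wholesale corruption.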
-
Unraid in semi-hung state
I can only do that from the boot menu, can't I? If so, it'll need to wait till I'm back in town so I can physically access the machine...
-
Unraid in semi-hung state
I can change settings via the GUI now, so that's working after the reboot. I ran the syslog through ChatGPT as well as having a skim myself, which came out with the following:

Category | Severity | Notes
ACPI BIOS Errors | Medium | Fix with BIOS update if available
Memory Controller (EDAC) Error | Medium–High | Run memory test; monitor ECC events
NVMe Missing SUBNQN | Low | Common, usually harmless
PCI Resource Mapping Warning | Low–Medium | Likely harmless; BIOS update may help

If I start my array now (as I have it set to manual start after reboot), I should uncheck "write corrections to parity" as a safety measure, right? Or should I not try to start the array yet?
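To look at the memory-controller entries directly rather than through a summary, the relevant syslog lines can be filtered out with grep. This is a minimal sketch: the search patterns are assumptions, since the exact wording of EDAC and machine-check messages varies by kernel and driver, and the function name is made up for illustration.

```shell
#!/bin/sh
# Print syslog lines that mention memory-controller or machine-check events.
# The patterns are assumptions; exact wording varies by kernel and driver.
memory_error_lines() {
    grep -iE 'EDAC|machine check|mce:' "$1" || true
}

# Only runs where the log exists, e.g. on the NAS itself.
if [ -r /var/log/syslog ]; then
    memory_error_lines /var/log/syslog
fi
```

The same function can be pointed at the syslog file extracted from a diagnostics ZIP.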
-
Unraid in semi-hung state
Rebooted, syslog and diagnostics attached. fwlonnas01-diagnostics-20251115-0759.zip syslog20251115.zip
-
Unraid in semi-hung state
I ran powerdown, which noted that it is deprecated but started the process (as I had also tried from the GUI). It seems to do the same thing: it says it is rebooting... but doesn't actually reboot. Looks like I'll need to physically reboot via the power button. Note: I checked /mnt/disk1 and /mnt/disk2 and both seemed to display the data OK using ls -lah.

:/mnt# powerdown -r
powerdown: /usr/local/sbin/powerdown has been deprecated

Broadcast message from root@NAS01 (pts/0) (Sat Nov 15 07:50:38 2025):

The system is going down for reboot NOW!
-
[SOLVED] Commands to start and stop array
If my array is stuck in the stopping state (been 12 hours now since I tried to stop it), am I at risk of data loss if I force a reboot?

:~# grep -E '^(mdState|fsState)=' /var/local/emhttp/var.ini
mdState="STARTED"
fsState="Stopping"
-
Unraid in semi-hung state
OK, should I just run "/sbin/reboot" from the command line or "powerdown -r"? Searching the forum gave multiple viewpoints. Note that the array is still 'stuck':

NAS01:~# grep -E '^(mdState|fsState)=' /var/local/emhttp/var.ini
mdState="STARTED"
fsState="Stopping"
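The 'stuck' check above can be wrapped in a small script so it is easy to repeat before deciding on a forced reboot. This is a sketch only: the file path and the two values come from the output in this post, while the function names are made up for illustration and the meaning of the state combination is inferred from this thread, not from Unraid documentation.

```shell
#!/bin/sh
# Read one key="value" entry from an ini-style file.
# $1 = key, $2 = file
read_ini_value() {
    sed -n "s/^$1=\"\(.*\)\"\$/\1/p" "$2"
}

# Succeeds when the file shows the combination seen in this post:
# mdState still STARTED while fsState reports Stopping.
array_is_stuck_stopping() {
    [ "$(read_ini_value mdState "$1")" = "STARTED" ] &&
        [ "$(read_ini_value fsState "$1")" = "Stopping" ]
}

# Only runs where the state file exists, e.g. on the NAS itself.
if [ -r /var/local/emhttp/var.ini ]; then
    if array_is_stuck_stopping /var/local/emhttp/var.ini; then
        echo "array still stopping"
    else
        echo "array state changed"
    fi
fi
```

Running it a few minutes apart shows whether the stop is making progress or has genuinely hung.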
-
Unraid in semi-hung state
Thanks, I suspected as much, as I saw some kernel messages in there. I'm trying to do a pre-reboot dump of diagnostics and it's been running a few minutes now. I assume it shouldn't take that long, so it may not be possible to grab it pre-reboot? I've attached the ZIP it's generated so far; there also seems to be one generated earlier today, which I've also attached. For the forced reboot, do I just need to run shutdown -r via the terminal, or physically reset it using the NAS power button? fwlonnas01-diagnostics-20251020-1252.zip fwlonnas01-diagnostics-20251020-2004.zip
-
Unraid in semi-hung state
Attached. NAS01syslog.zip
-
Unraid in semi-hung state
That's done. Where's the best place to upload it? It's 1.3 MB, so Pastebin fails (512 KB limit).
-
Unraid in semi-hung state
I'm able to SSH on to the server and the web UI works, but everything else seems to time out. I can't access files etc. across the network, and if I try and trigger any commands via the GUI they just time out as well. Reboot, same. Stop array, same. I couldn't see anything obvious in /var/log/syslog that indicated any issues. Running top doesn't show anything with high mem or CPU usage, but some of the cores keep spiking/staying at 100% utilisation.

Are there any logs I can provide to work out what might be going on here? Is there a way I can force the array offline in a 'safe' manner and then trigger a reboot perhaps? Any other thoughts as to how I should approach this?

root@NAS01:~# tail /var/log/syslog
Nov 14 16:06:46 NAS01 emhttpd: Sync filesystems...
Nov 14 16:06:46 NAS01 emhttpd: shcmd (214519): sync
Nov 14 16:06:47 NAS01 sshd-session[2206811]: Postponed keyboard-interactive/pam for root from 192.168.0.132 port 58796 ssh2 [preauth]
Nov 14 16:06:47 NAS01 sshd-session[2206811]: Accepted keyboard-interactive/pam for root from 192.168.0.132 port 58796 ssh2
Nov 14 16:06:47 NAS01 sshd-session[2206811]: pam_unix(sshd:session): session opened for user root(uid=0) by (uid=0)
Nov 14 16:06:47 NAS01 elogind-daemon[1785]: New session 2 of user root.
Nov 14 16:06:47 NAS01 sshd-session[2206811]: User child is on pid 2207105
Nov 14 16:06:47 NAS01 sshd-session[2207105]: Starting session: shell on pts/0 for root from 192.168.0.132 port 58796 id 0
Nov 14 16:09:28 NAS01 nginx: 2025/11/14 16:09:28 [error] 6488#6488: *17943 upstream timed out (110: Connection timed out) while reading upstream, client: 192.168.0.132, server: , request: "POST /update.htm HTTP/2.0", upstream: "http://unix:/var/run/emhttpd.socket:/update.htm", host: "nas01.local", referrer: "https://nas01.local/Main"
Nov 14 16:14:28 NAS01 nginx: 2025/11/14 16:14:28 [error] 6488#6488: *17943 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.0.132, server: , request: "POST /update.htm HTTP/2.0", upstream: "http://unix:/var/run/emhttpd.socket/update.htm", host: "nas01.local", referrer: "https://nas01.local/Settings/DiskSettings"
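The nginx entries in that tail are the GUI actions that timed out waiting on emhttpd. A small filter can list them from the full log, showing which request hung and which page it was sent from. This is a sketch that assumes the log lines follow the same layout as the two entries above; the function name is made up for illustration.

```shell
#!/bin/sh
# List nginx upstream timeouts as "<request> <- <referrer>".
# Assumes the line layout matches the entries quoted in this post.
upstream_timeouts() {
    grep 'upstream timed out' "$1" |
        sed -n 's/.*request: "\([^"]*\)".*referrer: "\([^"]*\)".*/\1 <- \2/p'
}

# Only runs where the log exists, e.g. on the NAS itself.
if [ -r /var/log/syslog ]; then
    upstream_timeouts /var/log/syslog
fi
```

For the two entries above it would report the POST to /update.htm from the Main page and from Settings/DiskSettings.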