Sten3danny Posted July 16

So I have this issue where my Unraid machine (6.12.10) starts freezing up after one or more days. The GUI and array still sort of work, but CPU usage is pegged at 100% and dockers and VMs are no longer responsive. I am pretty sure I have nailed the issue down to an SSD that I have running as a pool for my VMs.

At first, the issue occurred several times, each time after about one day, when the SSD attempted to go into suspend mode. After setting the spin down delay to 'never' I thought I had fixed the issue, as the machine was stable for over 6 days. However, this morning the issue reappeared. The machine was 'frozen' and the SSD showed a * where the temperature would normally be displayed, so apparently it still went into sleep mode or became unavailable in some other manner.

I remember having this issue with another SSD a year or two ago, but back then setting the spin down delay to 'never' fixed the issue for me. This time I am apparently not so lucky. Is this known behaviour for Unraid? Has anyone else seen this?

The SSD in question is a Kingspec 2TB m.2 SATA SSD, formatted with ZFS, in a single-device pool. I might replace it with a better-known brand if I knew that would fix the issue.

I forgot to copy the log before rebooting the machine, but if it freezes up again (don't hope so) I can add it here if that would help in identifying the issue. For now I am just wondering if this is a known issue, and if there is an (easy) fix for it.

Thanks a lot for any thoughts!

Regards,
Danny

JorgeB Posted July 16

26 minutes ago, Sten3danny said: "but if it freezes up again (don't hope so) I can add it here if that would help in identifying the issue."

Do that.

Sten3danny (Author) Posted July 16

Unfortunately I did not have to wait long; I just noticed the issue occurred again. I have attached the syslog. I am not sure if it holds any relevant information, as the error messages don't mean that much to me. I would really appreciate any insights! :)

Attachment: olifant-syslog-20240716-1831.zip

JorgeB Posted July 17

Jul 16 18:40:15 Olifant kernel: WARNING: Pool 'kingspec' has encountered an uncorrectable I/O failure and has been suspended.

This device dropped offline, and since there's no redundancy, zfs suspends the pool. That will make you unable to stop the array and do a clean reboot. Start by replacing the cables for that device.

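For anyone who wants to check their own syslog for the same warning, here is a minimal Python sketch. It assumes the message wording matches the kernel line quoted above exactly; the function names and the idea of scanning a saved syslog file are illustrative assumptions, not an Unraid tool.

```python
import re

# Pattern modelled on the kernel warning quoted in this thread.
# Assumption: other systems log the same wording; adjust if yours differs.
SUSPEND_RE = re.compile(
    r"WARNING: Pool '(?P<pool>[^']+)' has encountered an "
    r"uncorrectable I/O failure and has been suspended"
)


def find_suspended_pools(lines):
    """Return a list of (pool_name, log_line) for each suspension warning."""
    hits = []
    for line in lines:
        match = SUSPEND_RE.search(line)
        if match:
            hits.append((match.group("pool"), line.strip()))
    return hits


def scan_file(path):
    """Hypothetical helper: scan a saved syslog file at `path`."""
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(path, errors="replace") as handle:
        return find_suspended_pools(handle)
```

Calling `scan_file` on an extracted syslog returns one entry per suspension warning, so an empty list means the pool was never reported as suspended in that log.
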
Sten3danny (Author) Posted July 17

Hi JorgeB, thanks for your reply. It is an m.2 drive plugged directly into the motherboard, so there are no cables to replace. Could it be that it is just a bad quality drive? If so, I might try replacing it. Alternatively, I was thinking of reformatting the drive to XFS, because I remember reading that ZFS is not recommended for a single device pool. Or is that not the case any more? What are your thoughts?

JorgeB Posted July 17 (Solution)

59 minutes ago, Sten3danny said: "Could it be that it is just a bad quality drive?"

It's possible. You can try using a different m.2 slot, if one is available.

1 hour ago, Sten3danny said: "Alternatively, I was thinking to reformat the drive to XFS, because I remember reading that ZFS is not recommended for a single device pool. Or is that not the case any more?"

It's not that zfs is not recommended, but for a single device pool, and if you don't care about checksums or snapshots, xfs is fine. The issue was not caused by zfs, though. zfs just adds the problem of not being able to stop the array, which won't happen with btrfs or xfs.

Sten3danny (Author) Posted July 17

Unfortunately, I have no empty m.2 slots to swap it to. I will try reformatting to xfs, because I don't really NEED snapshots. If that doesn't fix the issue, I will probably replace the drive. Thanks for your help, JorgeB!

Oh, one last question: when changing the FS, should I use the 'Erase' function? I am not really sure what that does and couldn't find it in the manual.

JorgeB Posted July 17

11 minutes ago, Sten3danny said: "should I use the 'Erase' function?"

You can, that's the easiest way.

Sten3danny (Author) Posted July 17

Thanks, I will go ahead with that then.

Sten3danny (Author) Posted July 27

So, ten days after changing the filesystem to XFS, the issue hasn't reoccurred (yet). I am not sure if I've just been lucky so far, if this has actually 'fixed' the issue, or if there are other variables at play here, but I am happy for now.