October 22, 2025

Hi everyone,

I'd like some support with a weird bug my Unraid NAS box has hit over the last couple of days. I was notified on Monday afternoon that my SMB shares were no longer accessible; they had been working fine on Sunday.

The setup: Ryzen 7 3700X, Asus ROG STRIX B550-E GAMING, 16GB RAM, 3 HDDs (a 16TB Seagate parity drive and 2x 8TB WD Reds), 1x 1TB NVMe cache. Running 7.1.4 for the last month or so? I don't know exactly when I updated, but it was seamless and worked perfectly until now.

What happens: the computer boots normally, all the way until I log in, click on "Main", and mount the array. Starting the array makes the entire system hang and become unresponsive. It is impossible to navigate to other tabs like "Shares" or "Tools", the shares are not mounted and cannot be accessed, and the host stops responding to pings. A static IP is of course set up, as it has been for the last few years.

I caught a couple of weird errors in the syslog (because of course I tried to figure out the issue myself):

Oct 20 18:08:07 sunflowers kernel: mce: [Hardware Error]: Machine check events logged
Oct 20 18:08:07 sunflowers kernel: [Hardware Error]: Corrected error, no action required.
Oct 20 18:08:07 sunflowers kernel: [Hardware Error]: CPU:0 (17:71:0) MC25_STATUS[-|CE|MiscV|-|-|-|-|CECC|-|-|-]: 0x98004000003e0000
Oct 20 18:08:07 sunflowers kernel: [Hardware Error]: IPID: 0x000100ff03830400
Oct 20 18:08:07 sunflowers kernel: [Hardware Error]: Platform Security Processor Ext. Error Code: 62
Oct 20 18:08:07 sunflowers kernel: [Hardware Error]: cache level: RESV, tx: INSN

I googled that error and figured I needed a BIOS update, as the last one dated from 2020. I installed ASUS's latest stable BIOS firmware without issues; the board is now running the 2025-01-13 update.
I tried to mount the array again and was faced with the virtualization error (ERROR: could not insert 'kvm_amd': Operation not supported), because I had forgotten to re-enable virtualization in the newly updated BIOS. That was promptly corrected.

Now, though, each time I try mounting the array I have to manually shut down the server and start it again to access the WebUI. Mounting the array in maintenance mode works, but in either case the diagnostics and logs show no error, so I am stumped as to how to solve this issue. Please lend me an ear and help me figure out what might have made this very stable staple of my infrastructure bug out, with no obvious changes or different circumstances, after I had been running it for months and months with no problems.

Attached are diagnostics from my last reboot, and the previous syslog that I thought to keep when I got the first hardware error.

Best,

sunflowers-diagnostics-20251022-2026.zip syslog-previous.txt

Edited October 23, 2025 by aytsuqi
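For readers who want to sanity-check the machine-check line quoted in the first post, here is a minimal Python sketch that decodes the raw status word 0x98004000003e0000. The bit positions are assumed to match the Linux kernel's `MCI_STATUS_*` definitions, and the 6-bit extended error code field is assumed for this scalable-MCA CPU family; the function name `decode_mca_status` is made up for illustration and is not part of any tool.

```python
# Assumed bit positions, following the Linux kernel's MCI_STATUS_* macros.
FLAGS = {
    "Val": 63,    # register contains a valid error
    "Over": 62,   # an earlier error was overwritten
    "UC": 61,     # uncorrected error
    "En": 60,     # error reporting enabled
    "MiscV": 59,  # MISC register holds extra information
    "AddrV": 58,  # ADDR register holds a valid address
    "PCC": 57,    # processor context corrupt
    "CECC": 46,   # corrected by ECC
    "UECC": 45,   # uncorrectable ECC error
}


def decode_mca_status(status: int) -> dict:
    """Return the set flags and the extended error code of an MCA status word."""
    set_flags = [name for name, bit in FLAGS.items() if (status >> bit) & 1]
    return {
        "flags": set_flags,
        "corrected": "UC" not in set_flags,
        # Assumption: extended error code lives in bits 21:16 (6 bits wide).
        "ext_error_code": (status >> 16) & 0x3F,
    }


if __name__ == "__main__":
    print(decode_mca_status(0x98004000003E0000))
```

Under those assumptions the decoded flags agree with what the kernel printed (a corrected error, MiscV, CECC, extended error code 62). A corrected error on its own did not explain the hang.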
October 23, 2025 Community Expert Solution

Strange that there's nothing out of the ordinary logged other than the mce, but try this: change the filesystem to a different one for all disks and pools so that they don't mount (of course, don't format them), then start the array in normal mode. If it doesn't hang, revert the filesystem for just one device and retest (start with the pool), then keep reverting the other ones, one at a time.
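The isolation procedure described in this reply can be sketched as a small loop. This is only an illustration of the logic: in practice each "test" is a person starting the array in the WebUI and rebooting after a hang. The names `find_hanging_device` and `array_starts_ok` are hypothetical, and the device labels in the example are placeholders.

```python
from typing import Callable, Iterable, Optional, Set


def find_hanging_device(
    devices: Iterable[str],
    array_starts_ok: Callable[[Set[str]], bool],
) -> Optional[str]:
    """Re-enable devices one at a time and return the first one that
    makes the array start hang, or None if every device mounts fine."""
    enabled: Set[str] = set()

    # Step 1: nothing is mountable. If the start still hangs,
    # the problem is not tied to mounting any particular device.
    if not array_starts_ok(enabled):
        raise RuntimeError("array start hangs even with no device mounted")

    # Step 2: revert the filesystem setting for one device per test.
    for dev in devices:
        enabled.add(dev)
        if not array_starts_ok(enabled):
            return dev
    return None


if __name__ == "__main__":
    # Simulated check: pretend the pool is the device that causes the hang.
    simulated = lambda mounted: "cache" not in mounted
    print(find_hanging_device(["cache", "disk1", "disk2"], simulated))
```

Testing the pool first, as suggested, means a faulty pool is found after a single extra start.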
October 23, 2025 Author

Hey JorgeB,

I can't find the option to change the filesystem on these disks. After the reboot, on Main, I only have the option to deselect the drive, in which case it shows as missing with the red cross; when I select it again, I am never prompted to choose the filesystem. What I mean is that there is no dropdown menu for the filesystem.

Any ideas?
October 23, 2025 Community Expert

You need to click on the device name (e.g. Disk1) to bring up the required dialog.
October 23, 2025 Author

Okay, I am an idiot. itimpi, thanks for your help! With this guidance I managed to follow JorgeB's instructions: it seems my problem is on the cache disk (which is a big relief, since the HDDs are so much more expensive and hold the data!). The array starts without issues when the cache is unmountable.

Turns out the culprit is the SSD. I ran a quick SMART scan afterwards; here are the results:

A bit confusing, no? "Completed without error", but there are still some entries showing in the log. Is the drive dead?

EDIT: Would changing the FS and erasing it help?

Edited October 23, 2025 by aytsuqi
October 23, 2025 Community Expert

56 minutes ago, aytsuqi said: EDIT: Would changing the FS and erasing it help?

It's worth a try if you don't mind losing that data.
October 23, 2025 Author

I erased the cache disk, changed it to xfs, and formatted it; then I stopped the array again, changed the filesystem back to btrfs, erased it once more, and formatted it again. The array now mounts without issues.

It's weird that SMART passes and that there was no other indication of a possible error in the logs, so although I am happy this is solved, I wonder what the cause might have been. Every disk still shows 0 errors in "Main".

I'll still mark this as solved. Thank you very much, JorgeB and itimpi, for the advice!
October 23, 2025 Author

If you are indeed talking about the very long process you can select when GRUB comes up, alongside "Unraid OS", "Unraid OS GUI" and so on, I ran it once last year or so, but not since.

Would there be any correlation between the RAM and the cache SSD? My system went through the boot process many, many times without any issues while I was troubleshooting this.
October 23, 2025 Community Expert

There's a possibility that bad RAM corrupted the filesystem. That is more likely than a device problem, so it may be worth running it again.