November 8, 2024: I notice that my server becomes unresponsive in less than a day. It is a new setup. I know that my hard drives are oscillating between 44 and 46 °C, creating a ton of drive temperature warnings, as the threshold was 45 °C. Sometimes I am still able to ping the server and load some Docker web UIs, but loading the Unraid main UI seems to be the cause from time to time. I am running a Tdarr server with nodes on other computers, which creates a lot of I/O on the drives. I have attached the last two syslogs I got: syslog-previous, syslog-previous (1)
November 8, 2024 (Community Expert): Multiple call traces and segfaults are logged; this looks more like a hardware issue.
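For anyone wanting to check their own syslog for the same symptoms, here is a minimal sketch that counts call traces and segfaults. The function name `count_crash_markers` and the two match strings are assumptions based on typical Linux kernel log wording, not something taken from the attached logs.

```python
import re

# Assumed markers: the usual kernel wording for a stack dump and a
# userspace segfault. Adjust them if your syslog phrases these differently.
PATTERNS = {
    "call_trace": re.compile(r"Call Trace:"),
    "segfault": re.compile(r"segfault at"),
}


def count_crash_markers(lines):
    """Return how many lines match each marker in PATTERNS."""
    counts = {name: 0 for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def count_in_file(path):
    """Convenience wrapper: scan a syslog file on disk."""
    with open(path, "r", errors="replace") as handle:
        return count_crash_markers(handle)
```

Calling `count_in_file("syslog-previous")` on a downloaded copy of the attachment would give a rough count per marker. A non-zero count only says that something crashed, not which component is at fault.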
November 12, 2024 (Author): Just to keep this updated. So far, memtest86 passed and the SMART test passed on all disks. I tried to run badblocks on Unraid, but Unraid crashed within an hour, twice. I am now trying badblocks from an Ubuntu live USB. I might try a CPU benchmark if no issues are found.
November 14, 2024 (Author): badblocks has been running from the Ubuntu live USB for 45+ hours, with no errors so far. The only other thing is a card I bought to use some M.2 drives as cache and for VM passthrough, but no drive on it was used by Unraid. Dual M.2 PCIE Adapter Card, 4 Port M.2 NVMe SSD to PCIE X16 M Key Hard Drive Converter Reader Expansion Card PC Internal Expansion Card(ph44) https://a.co/d/95V10yb
November 16, 2024 (Author, Solution): Final update. No bad hardware, but a bad hardware configuration. My understanding is that the PCIe M.2 adapter card requires PCIe bifurcation, which is not supported on a Z420. This causes smartctl to crash in some way (it works on the one drive present, so I suppose it crashes on the empty controller), which in turn causes Unraid to crash within a day. Forcing the SMART controller type to ATA instead of auto allows the system to run, as SMART will simply not run on the NVMe devices. This fixes my issue until I replace the PCIe card later this week.
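As a rough illustration of what forcing the controller type corresponds to at the smartctl level, here is a minimal sketch that builds a command line using smartctl's `-d` device type option. How Unraid itself invokes smartctl is an assumption here; the helper name `smartctl_command` and the device path `/dev/sdb` are placeholders, not taken from the posts above.

```python
# A subset of the device types that smartctl accepts for its -d option.
ALLOWED_TYPES = {"auto", "ata", "scsi", "sat", "nvme"}


def smartctl_command(device, device_type="auto"):
    """Build (but do not run) a smartctl command with a forced device type.

    With "ata", smartctl talks to the device as an ATA disk instead of
    auto-detecting its type.
    """
    if device_type not in ALLOWED_TYPES:
        raise ValueError(f"unsupported device type: {device_type!r}")
    # -a prints all SMART information, -d selects the device type.
    return ["smartctl", "-a", "-d", device_type, device]


if __name__ == "__main__":
    # Placeholder device path; replace it with a real disk before running.
    print(" ".join(smartctl_command("/dev/sdb", "ata")))
```

The sketch only assembles the argument list, so it can be inspected safely before being passed to something like `subprocess.run` on a machine where smartctl is installed.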