sirebral Posted August 7, 2021
Hi Guys,

I have a SuperMicro X9DRi-LN4+/X9DR3-LN4+ server that's on the latest Unraid. It seems to run fine during the day, yet when I wake up it has almost always crashed and the disks are in a weird state. This morning it showed all the disks mounted at the top of Main, yet also showed them all under available devices.

The box has a data array of 7 spinning disks (6+1 parity) on XFS. Five of the disks are 18 TB and two are 4 TB. I also run a BTRFS cache pool with 4 devices. The server has 2x Xeon CPUs and 196 GB of DDR3 ECC at 1066. The BIOS has conservative settings, no overclocking or anything odd. I am using an IT-mode flashed HBA to drive the array.

I've attached the hardware dump and the diags. I'm hopeful someone can tell me what's going on, as I can't use this system until this problem has been resolved. Luckily I haven't encountered any system-disabling corruption yet, but if it keeps dying I don't think that trend will continue. I'm keeping all of my valuable data offline at this point, waiting until this system is stable.

Most of the storage is rather new and the server is rather old, yet all the firmware I could find is up to date. I also have a RAID controller that I could put in. It has JBOD and battery backup, yet I'm thinking the less complex card is better.

Looking forward to your help, as I'd really like to complete this box so I can start on a 3-way cluster that's been waiting in the wings.

Thank you!
Keith

Attachments: hardwarre.txt, media-diagnostics-20210807-1402.zip
trurl Posted August 8, 2021
Looks like it may be controller problems, or possibly power, since multiple disks are affected. How are the cache drives connected?
sirebral Posted August 8, 2021
The cache is on the same controller; all disks share an extended 12-port SAS/SATA backplane.
Edited August 8, 2021 by sirebral
sirebral Posted August 9, 2021
I ran an overnight diagnostic that stressed the disks, CPU, memory, and graphics. There was not one error, so I conclude that either there's some sort of issue between the included drivers and my system or, since it happens overnight, every night, it may be a bad plugin or a conflict. Can the Unraid team take a look? In the meantime I suppose I'm going to try TrueNAS.

Interesting side note: while I was backing everything up last night, I had the array online but had disabled Docker and all recurring plugins. It ran all night under heavy usage (an rsync of several hundred TB with Unraid running) with no crashes at all.
Edited August 9, 2021 by sirebral
JorgeB Posted August 11, 2021
Enable the syslog mirror to flash, then post the resulting syslog after a crash.
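Once the mirrored syslog has survived a crash, it can help to skim the last entries before posting the whole file. The following is only a minimal sketch: the path `/boot/logs/syslog` is an assumption about where the mirror lands on the flash drive, and `suspicious_lines` and its keyword list are hypothetical names chosen for illustration, not part of Unraid.

```python
from pathlib import Path

# Hypothetical keyword list; tune it to whatever the real log shows.
KEYWORDS = ("error", "call trace", "timeout", "reset", "i/o")


def suspicious_lines(lines, keywords=KEYWORDS, tail=200):
    """Return lines among the last `tail` entries that contain any keyword.

    Matching is a plain case-insensitive substring check.
    """
    recent = list(lines)[-tail:]
    return [
        ln.rstrip("\n")
        for ln in recent
        if any(k in ln.lower() for k in keywords)
    ]


if __name__ == "__main__":
    # Assumed location of the mirrored log; adjust if yours differs.
    log_path = Path("/boot/logs/syslog")
    if log_path.exists():
        with log_path.open(errors="replace") as fh:
            for line in suspicious_lines(fh):
                print(line)
    else:
        print(f"{log_path} not found")
```

This only narrows down what to read first; the full syslog is still what should be attached to the thread.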