6.10.3: Disk errors when using new mainboard, everything fine on old mainboard



Hi, I couldn't fit this into a short title without losing the gist of the issue, so here's the long version:

 

I have a running system, specs are:

Z97 ASUS mainboard with Xeon CPU

24GB RAM

Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 (DELL H310)

4+1 disk Array (xfs)

1 SSD cache (btrfs)

2 HDD pool (btrfs)

I've been using this setup for years with no crashes and no issues.

 

nexus-diagnostics-working-xeon-20220724-1401.zip 

 

But now my old Ryzen CPU and mainboard are sitting idle, so I figured this was the chance to upgrade:

B450 MSI mainboard, Ryzen 5 1600AF (2600)

16 GB RAM

 

I tested this setup with retired HDDs (on onboard SATA) and a trial USB key. It ran for days with a Quadro passed through to a VM, all fine.

 

But as soon as I merge both systems (minus the old test drives, that is), it all hits the fan; see the attached log. The cache pool HDDs go first, then the array. All drives end up under Unassigned Devices but cannot be mounted (grey "Array" button). Given enough idle time, they even get disabled. I had to redo the array at that point, although no data was lost and ...

... when I undo the swap and go back to the Xeon system, everything returns to normal, and I have no clue why.

I can boot the system without starting the array and mount the drives as unassigned devices with no issues, but once I start the array properly, that's it.

 

nexus-diagnostics-failing-ryzen-20220724-1319.zip

 

As if things weren't weird enough, here's another one to make it even better:

 

I built this test setup for science:

B450 M/B + Ryzen

LSI Controller

An NVMe SSD for cache

2 old retired HDDs (same as above)

 

I launched it, and it ran just fine. So the old system works, and the new system works without drives, with other drives, and with the LSI controller plus other drives, just not with my drives.

Could this be caused by a plugin or a VM? I am pretty sure I also tried safe mode at some point, though.

 

I'll gladly take any hint. At this point I am sticking with the old system until it dies, and then who knows.

Link to comment
Jul 24 13:18:49 Nexus kernel: sd 10:0:0:0: [sdc] Synchronizing SCSI cache
Jul 24 13:18:49 Nexus kernel: sd 10:0:0:0: [sdc] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK
Jul 24 13:18:49 Nexus kernel: sd 10:0:1:0: [sdd] Synchronizing SCSI cache
Jul 24 13:18:49 Nexus kernel: sd 10:0:1:0: [sdd] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK
Jul 24 13:18:49 Nexus kernel: sd 10:0:2:0: [sde] Synchronizing SCSI cache
Jul 24 13:18:49 Nexus kernel: sd 10:0:2:0: [sde] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK
Jul 24 13:18:49 Nexus kernel: sd 10:0:3:0: [sdf] Synchronizing SCSI cache
Jul 24 13:18:49 Nexus kernel: sd 10:0:3:0: [sdf] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK
Jul 24 13:18:49 Nexus kernel: sd 10:0:4:0: [sdg] Synchronizing SCSI cache
Jul 24 13:18:49 Nexus kernel: sd 10:0:4:0: [sdg] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK
Jul 24 13:18:49 Nexus kernel: sd 10:0:5:0: [sdh] Synchronizing SCSI cache
Jul 24 13:18:49 Nexus kernel: sd 10:0:5:0: [sdh] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK

 

Multiple disks dropped at the same time, which suggests a power/connection problem. It could also be an issue with the HBA, though that is much less likely.
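
For anyone who wants to check their own syslog for the same pattern, here is a minimal Python sketch that groups the "Synchronize Cache(10) failed" lines by timestamp, SCSI host number, and host byte. The function name `group_sync_failures` and the regular expression are my own assumptions for illustration, not part of Unraid or any kernel tool. The host byte names are taken from the Linux kernel SCSI headers, where 0x01 is DID_NO_CONNECT. Several devices landing in one group (same second, same host) is what "dropped at the same time" looks like in the log.

```python
import re
from collections import defaultdict

# Subset of Linux SCSI host byte codes (from the kernel's SCSI headers).
HOST_BYTES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
}

# Hypothetical pattern written for the syslog format shown in this thread.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+[\d:]{8})\s+\S+\s+kernel:\s+"
    r"sd\s+(?P<host>\d+):\d+:\d+:\d+:\s+\[(?P<dev>\w+)\]\s+"
    r".*failed: Result: hostbyte=(?P<hb>0x[0-9a-fA-F]+)"
)


def group_sync_failures(lines):
    """Group failing devices by (timestamp, SCSI host number, host byte name)."""
    groups = defaultdict(list)
    for line in lines:
        match = LINE_RE.search(line.strip())
        if not match:
            continue  # not a failure line, e.g. "Synchronizing SCSI cache"
        code = int(match.group("hb"), 16)
        key = (
            match.group("ts"),
            int(match.group("host")),
            HOST_BYTES.get(code, hex(code)),
        )
        groups[key].append(match.group("dev"))
    return dict(groups)


# A few lines copied from the log excerpt above.
SAMPLE_LOG = """\
Jul 24 13:18:49 Nexus kernel: sd 10:0:0:0: [sdc] Synchronizing SCSI cache
Jul 24 13:18:49 Nexus kernel: sd 10:0:0:0: [sdc] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK
Jul 24 13:18:49 Nexus kernel: sd 10:0:1:0: [sdd] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK
Jul 24 13:18:49 Nexus kernel: sd 10:0:2:0: [sde] Synchronize Cache(10) failed: Result: hostbyte=0x01 driverbyte=DRIVER_OK
"""

if __name__ == "__main__":
    for (ts, host, reason), devs in group_sync_failures(SAMPLE_LOG.splitlines()).items():
        print(f"{ts} host{host} {reason}: {', '.join(devs)}")
```

To run it against a real log, pass the lines of the syslog file from the diagnostics zip instead of `SAMPLE_LOG`. This only summarizes what the log already says; it does not tell you whether power, cabling, or the HBA is at fault.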

Link to comment

As I said, the rest of the system (PSU, HBA, cables, HDDs and SSDs) remains the same; only the mainboard/CPU/RAM was swapped. When mounting all drives as unassigned devices there were no issues either. This happens the very moment I hit the "Start Array" button. After swapping back to the old mainboard, all is well.

This new mainboard/CPU/RAM was my daily system for about a year, it just works(tm), and it worked as a test setup.

If I didn't have the stable system to compare against, I might come to the same conclusion, but I do, and I am not going easy on it, so it is "burnt in". The PSU is a modular 600W beQuiet, if that helps.

That's the main issue: on their own, all the parts work, but they can't be combined.

Link to comment

Sadly, this board only has two x16 slots, and that wouldn't explain why I can access files on disks attached to this HBA when they are mounted as unassigned, or why I could use this controller in a test rig with the old HDDs.

These are things I had in mind at first too, but they don't explain the other observations.

Link to comment