untraceablez Posted December 1, 2020 (edited December 1, 2020)
Hello Everyone,
So I just relocated my server this morning from the living room to my office to clear up some space, and after plugging everything back in and booting up, all of my SATA drives are missing. All of the HDDs are missing, but my 2 SSDs show up fine. I have the following drives:
Parity - 10 TB Seagate IronWolf Pro
Disk 1 - 10 TB Seagate IronWolf Pro
Disk 2 - 10 TB Seagate IronWolf Pro
Disk 3 - 4 TB Seagate IronWolf
Disk 4 - 4 TB Seagate IronWolf
Cache
1. 500GB Samsung 970 EVO M.2 NVMe
2. 500GB Samsung 970 EVO M.2 NVMe
I scanned the system log, but it only shows that the system failed to see the devices. I've checked all the cabling and even reseated every SATA data and power connection, in the same slots as before, even though nothing seemed to have come undone. The move was maybe 14 feet at most across a flat surface, so I highly doubt it's real damage to the drives, especially given that 3 of them are relatively new (less than 3 months old).
I have my diagnostics zip attached. Any info or ideas would be fantastic. I'm hoping I don't have to replace 3-month-old drives or wipe the array.
smaugs-trove-diagnostics-20201201-0723.zip
JorgeB Posted December 1, 2020
Both 4TB drives are being detected but fail to initialize, e.g.:
    Dec 1 07:18:18 Smaugs-Trove kernel: scsi 6:0:0:0: Direct-Access ATA ST4000VN008-2DR1 SC60 PQ: 0 ANSI: 5
    ...
    Dec 1 07:18:18 Smaugs-Trove kernel: sd 5:0:0:0: [sdb] Synchronize Cache(10) failed: Result: hostbyte=0x04 driverbyte=0x00
    Dec 1 07:18:18 Smaugs-Trove kernel: sd 5:0:0:0: [sdb] Stopping disk
    Dec 1 07:18:18 Smaugs-Trove kernel: sd 5:0:0:0: [sdb] Start/Stop Unit failed: Result: hostbyte=0x04 driverbyte=0x00
    Dec 1 07:18:18 Smaugs-Trove kernel: ata6.00: disabled
I don't see any sign of the 10TB drives. To me this looks more like a power problem.
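For anyone who wants to pull the same clues out of a long syslog, here is a minimal Python sketch of the check described in the post above. It is only an illustration: the function name `summarize_disk_errors` and the two regular expressions are assumptions written against the few log lines quoted in this thread, not a general kernel log parser.

```python
import re

# Sample lines copied from the post above. Real syslogs contain many other
# message formats that these patterns do not attempt to cover.
SAMPLE_LOG = """\
Dec 1 07:18:18 Smaugs-Trove kernel: scsi 6:0:0:0: Direct-Access ATA ST4000VN008-2DR1 SC60 PQ: 0 ANSI: 5
Dec 1 07:18:18 Smaugs-Trove kernel: sd 5:0:0:0: [sdb] Synchronize Cache(10) failed: Result: hostbyte=0x04 driverbyte=0x00
Dec 1 07:18:18 Smaugs-Trove kernel: sd 5:0:0:0: [sdb] Stopping disk
Dec 1 07:18:18 Smaugs-Trove kernel: sd 5:0:0:0: [sdb] Start/Stop Unit failed: Result: hostbyte=0x04 driverbyte=0x00
Dec 1 07:18:18 Smaugs-Trove kernel: ata6.00: disabled
"""

# Hypothetical patterns, derived only from the lines quoted in this thread.
FAILED_RE = re.compile(
    r"kernel: sd (\S+): \[(\w+)\] (.+?) failed: Result: hostbyte=(0x[0-9a-fA-F]+)"
)
DISABLED_RE = re.compile(r"kernel: (ata\d+\.\d+): disabled")


def summarize_disk_errors(log_text):
    """Return (failures, disabled_ports) found in the given log text."""
    failures = []
    disabled_ports = []
    for line in log_text.splitlines():
        match = FAILED_RE.search(line)
        if match:
            failures.append(
                {
                    "address": match.group(1),   # SCSI address, e.g. 5:0:0:0
                    "device": match.group(2),    # block device, e.g. sdb
                    "command": match.group(3),   # the command that failed
                    "hostbyte": int(match.group(4), 16),
                }
            )
            continue
        match = DISABLED_RE.search(line)
        if match:
            disabled_ports.append(match.group(1))
    return failures, disabled_ports


if __name__ == "__main__":
    failures, disabled_ports = summarize_disk_errors(SAMPLE_LOG)
    for failure in failures:
        print(failure)
    print("disabled ATA ports:", disabled_ports)
```

Run against a full syslog, a list like this makes it easier to see which drives were detected and then dropped, as opposed to drives that never appear at all.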
untraceablez Posted December 1, 2020 (Author)
3 minutes ago, JorgeB said:
    Don't see any sign of the 10TB drives, this to me looks more like a power problem.
Do you mean like a bad SATA power connector, not enough power from the PSU, or a problem with power from the wall? Note that it is plugged into the same APC UPS unit it was plugged into in the other room.
trurl Posted December 1, 2020
The PSU. Do you have any power splitters?
untraceablez Posted December 1, 2020 (Author)
1 minute ago, trurl said:
    PSU Do you have any splits?
I have 3 cables running from the PSU to the drives. 2 of the cables use 2 connectors each, and 1 cable runs exclusively to 1 drive, just due to length issues in the case. The PSU is an 850W modular unit from Corsair.
trurl Posted December 1, 2020
Did you check all the power connections?
untraceablez Posted December 1, 2020 (Author)
Just now, trurl said:
    Did you check all the power connections?
Yeah, I checked all the connections on the drives already; that was the first thing I tried after a reboot. I've seen stories of bad cables in other threads, but I have a feeling that I wouldn't have 5 SATA PSU connectors all fail at exactly the same time.
trurl Posted December 1, 2020
26 minutes ago, untraceablez said:
    checked all the connections on the drives already
Since it is a modular PSU, the connections at the drive end are not the only ones to check. Check the connections at the PSU end too, and any power splitters.
untraceablez Posted December 1, 2020 (Author)
1 hour ago, trurl said:
    Since it is modular PSU the connections at the drive end are not the only connections to be checked. And also any power splitters.
I just got done re-assembling the PC; I have to take the motherboard tray out entirely to get to the PSU. I checked all the connections and actually reduced the number of SATA cables to just 2. The drives show up in the BIOS, so I'm not sure why they're not showing up in Unraid at this point. I've attached a new copy of the log in case there is a new clue in it, since I physically checked and re-seated everything.
smaugs-trove-diagnostics-20201201-0723.zip
JorgeB Posted December 1, 2020
You have multiple SATA controllers bound to vfio-pci:
    09:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
        Subsystem: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901]
        Kernel driver in use: vfio-pci
        Kernel modules: ahci
    0a:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
        Subsystem: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901]
        Kernel driver in use: vfio-pci
        Kernel modules: ahci
That doesn't explain the 4TB disks, but it probably explains the other missing ones. Unbind them and post new diags.
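The same check can be scripted. Below is a minimal Python sketch that scans `lspci`-style text for SATA controllers whose driver in use is vfio-pci. It assumes the layout shown in the post above (a device line followed by indented detail lines); the function name `sata_controllers_on_vfio` and both regular expressions are illustrative assumptions, and the sample text is copied from this thread with the line breaks assumed.

```python
import re

# Device listing copied from the post above; the indentation of the detail
# lines is an assumption about how the original output was laid out.
SAMPLE_LSPCI = """\
09:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
        Subsystem: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901]
        Kernel driver in use: vfio-pci
        Kernel modules: ahci
0a:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
        Subsystem: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901]
        Kernel driver in use: vfio-pci
        Kernel modules: ahci
"""

# A device line starts at column 0 with bus:device.function, then the class
# name, then the class code in brackets.
DEVICE_RE = re.compile(r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) (.+?) \[[0-9a-f]{4}\]: ")
# Detail lines are indented under the device line they belong to.
DRIVER_RE = re.compile(r"^\s+Kernel driver in use: (\S+)")


def sata_controllers_on_vfio(lspci_text):
    """Return PCI addresses of SATA controllers currently bound to vfio-pci."""
    bound = []
    current = None  # (address, class name) of the device being read
    for line in lspci_text.splitlines():
        match = DEVICE_RE.match(line)
        if match:
            current = (match.group(1), match.group(2))
            continue
        match = DRIVER_RE.match(line)
        if (
            match
            and current is not None
            and current[1] == "SATA controller"
            and match.group(1) == "vfio-pci"
        ):
            bound.append(current[0])
    return bound


if __name__ == "__main__":
    print(sata_controllers_on_vfio(SAMPLE_LSPCI))
```

Any controller this reports is reserved for passthrough, so the host will not see the disks attached to it until the controller is unbound.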
untraceablez Posted December 1, 2020 (Author)
26 minutes ago, JorgeB said:
    You have multiple SATA controllers bound to vfio-pci [...] Doesn't explain the 4TB disks, but probably does the other missing ones, unbind them and post new diags.
Well, unbinding the controllers brought all of the drives back, including the 4TB ones. I'm beyond stumped as to why that did it, but hey, it worked! I seriously appreciate the help both of you have given me today, @JorgeB and @trurl.
JorgeB Posted December 1, 2020
You're welcome. There's no way the bound controllers were the result of moving the server, which is why I didn't look at that first. It's still strange about the 4TB disks, though.
untraceablez Posted December 1, 2020 (Author)
1 hour ago, JorgeB said:
    You're welcome, but no way the bound controllers were the result of moving the server, why I didn't look at that first, though still strange about the 4TB disks.
I don't know how I didn't run into this earlier. I'd rebooted a few times before the move and never had an issue. Sometimes servers work in mysterious ways...