[SOLVED] Moved Rooms, Drives missing



Hello Everyone,

 

I relocated my server this morning from the living room to my office to free up some space. After plugging everything back in and booting up, all of my SATA hard drives are missing, while my 2 NVMe SSDs show up fine. These are the drives:

 

Parity - 10 TB Seagate IronWolf Pro

Disk 1 - 10 TB Seagate IronWolf Pro

Disk 2 - 10 TB Seagate IronWolf Pro

Disk 3 - 4 TB Seagate IronWolf

Disk 4 - 4 TB Seagate IronWolf

 

Cache

 

1. 500GB Samsung 970 EVO M.2 NVMe

2. 500GB Samsung 970 EVO M.2 NVMe

 

I scanned the system log, but it only shows that the system failed to see the devices. I've checked all the cabling and even reseated every SATA data and power connection in the same ports as before, although nothing seemed to have come loose. The move was maybe 14 feet at most across a flat surface, so I highly doubt the drives are physically damaged, especially since 3 of them are relatively new (less than 3 months old). I've attached my diagnostics zip. Any info or ideas would be fantastic; I'm hoping I don't have to replace 3-month-old drives or wipe the array.

smaugs-trove-diagnostics-20201201-0723.zip

Edited by untraceablez
Link to comment

Both 4TB drives are being detected but fail to initialize, e.g.:

Dec  1 07:18:18 Smaugs-Trove kernel: scsi 6:0:0:0: Direct-Access     ATA      ST4000VN008-2DR1 SC60 PQ: 0 ANSI: 5
...
Dec  1 07:18:18 Smaugs-Trove kernel: sd 5:0:0:0: [sdb] Synchronize Cache(10) failed: Result: hostbyte=0x04 driverbyte=0x00
Dec  1 07:18:18 Smaugs-Trove kernel: sd 5:0:0:0: [sdb] Stopping disk
Dec  1 07:18:18 Smaugs-Trove kernel: sd 5:0:0:0: [sdb] Start/Stop Unit failed: Result: hostbyte=0x04 driverbyte=0x00
Dec  1 07:18:18 Smaugs-Trove kernel: ata6.00: disabled

I don't see any sign of the 10TB drives. To me this looks more like a power problem.
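
For anyone who wants to pull the same three signals out of a syslog (device detected, command failed, port disabled), here is a minimal Python sketch. The sample text is copied from the excerpt above; the regular expressions and the `summarize` helper are my own illustration and only cover the message shapes seen in this excerpt, not every kernel message format.

```python
import re

# Sample lines copied from the syslog excerpt in this post.
LOG = """\
Dec  1 07:18:18 Smaugs-Trove kernel: scsi 6:0:0:0: Direct-Access     ATA      ST4000VN008-2DR1 SC60 PQ: 0 ANSI: 5
Dec  1 07:18:18 Smaugs-Trove kernel: sd 5:0:0:0: [sdb] Synchronize Cache(10) failed: Result: hostbyte=0x04 driverbyte=0x00
Dec  1 07:18:18 Smaugs-Trove kernel: sd 5:0:0:0: [sdb] Stopping disk
Dec  1 07:18:18 Smaugs-Trove kernel: sd 5:0:0:0: [sdb] Start/Stop Unit failed: Result: hostbyte=0x04 driverbyte=0x00
Dec  1 07:18:18 Smaugs-Trove kernel: ata6.00: disabled
"""

# Hypothetical patterns written for the lines above only.
DETECTED = re.compile(r"kernel: scsi (\d+:\d+:\d+:\d+): Direct-Access\s+ATA\s+(\S+)")
FAILED = re.compile(r"kernel: sd \d+:\d+:\d+:\d+: \[(\w+)\] (.+?) failed:")
DISABLED = re.compile(r"kernel: (ata\d+\.\d+): disabled")


def summarize(log_text):
    """Collect detected disks, failed commands, and disabled ATA devices."""
    summary = {"detected": [], "failed": [], "disabled": []}
    for line in log_text.splitlines():
        match = DETECTED.search(line)
        if match:
            # (SCSI address, model string)
            summary["detected"].append((match.group(1), match.group(2)))
            continue
        match = FAILED.search(line)
        if match:
            # (block device name, command that failed)
            summary["failed"].append((match.group(1), match.group(2)))
            continue
        match = DISABLED.search(line)
        if match:
            summary["disabled"].append(match.group(1))
    return summary


if __name__ == "__main__":
    for kind, entries in summarize(LOG).items():
        print(kind, entries)
```

A disk that appears under "detected" and then under "failed" or "disabled" is visible on the bus but not usable, which is the pattern described above for the 4TB drives.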
 

 

Link to comment
Do you mean a bad SATA power connector, not enough power from the PSU, or a problem at the wall? Note that it is plugged into the same APC UPS it was plugged into in the other room.

Link to comment
1 hour ago, trurl said:

Since it is a modular PSU, the connections at the drive end are not the only ones to check. Also check any power splitters.

I just finished reassembling the PC; I have to take the motherboard tray out entirely to get to the PSU. I checked all the connections and cut the number of SATA cables down to just 2. The drives show up in the BIOS, so I'm not sure why they're not showing up in Unraid at this point. I've attached a new copy of the diagnostics in case there's a new clue in them now that I've physically checked and reseated everything.

smaugs-trove-diagnostics-20201201-0723.zip

Link to comment

You have multiple SATA controllers bound to vfio-pci:

09:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
    Subsystem: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901]
    Kernel driver in use: vfio-pci
    Kernel modules: ahci
0a:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
    Subsystem: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901]
    Kernel driver in use: vfio-pci
    Kernel modules: ahci

That doesn't explain the 4TB disks, but it probably explains the other missing ones. Unbind them and post new diags.
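
To spot this kind of binding without reading the whole listing by eye, here is a rough Python sketch that parses `lspci -nnk`-style text and lists the devices whose driver in use is `vfio-pci`. The sample is the output quoted above; the regular expressions and the `parse_lspci` and `bound_to` helpers are hypothetical names for illustration, and they assume the short `bus:dev.fn` address form shown here (no PCI domain prefix).

```python
import re

# Sample copied from the lspci output quoted in this post.
SAMPLE = """\
09:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
    Subsystem: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901]
    Kernel driver in use: vfio-pci
    Kernel modules: ahci
0a:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
    Subsystem: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901]
    Kernel driver in use: vfio-pci
    Kernel modules: ahci
"""

# Device header: address, class name, [class code], ..., [vendor:device]
HEADER = re.compile(
    r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]) (.+?) \[[0-9a-f]{4}\]: "
    r".*\[([0-9a-f]{4}:[0-9a-f]{4})\]"
)
DRIVER = re.compile(r"^\s+Kernel driver in use: (\S+)")


def parse_lspci(text):
    """Return one dict per device: address, class, vendor:device id, driver."""
    devices = []
    current = None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            current = {
                "address": header.group(1),
                "class": header.group(2),
                "id": header.group(3),
                "driver": None,  # stays None if no driver line follows
            }
            devices.append(current)
            continue
        driver = DRIVER.match(line)
        if driver and current is not None:
            current["driver"] = driver.group(1)
    return devices


def bound_to(devices, driver_name):
    """Addresses of the devices currently using the given kernel driver."""
    return [d["address"] for d in devices if d["driver"] == driver_name]


if __name__ == "__main__":
    devices = parse_lspci(SAMPLE)
    print("vfio-pci:", bound_to(devices, "vfio-pci"))
    print("ids:", sorted({d["id"] for d in devices}))
```

Both controllers above report the same vendor:device ID, `1022:7901`. If a binding is set up by ID (for example through the `vfio-pci.ids=` kernel parameter) rather than by PCI address, it applies to every device with that ID. The thread does not say how the binding was configured here, so treat that as a possibility to check rather than a diagnosis.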

Link to comment
Well, unbinding the controllers brought all of the drives back, including the 4TB ones. I'm beyond stumped as to why that did it, but hey, it worked! I seriously appreciate the help both of you have given me today, @JorgeB and @trurl.

Link to comment
1 hour ago, JorgeB said:

You're welcome. There's no way the bound controllers were the result of moving the server, which is why I didn't look at that first. It's still strange about the 4TB disks, though.

I don't know how I didn't run into this earlier; I'd rebooted a few times before the move without any issue. Sometimes servers work in mysterious ways...

Link to comment
