Ver7o, posted November 27, 2019 (edited December 10, 2019)

Hello guys, I run my VMs off an unassigned SSD, which works great. The problem is that when I turn my VMs off overnight, the drive eventually goes to sleep (its indicator turns from green to grey) and then goes missing (it appears under historical devices or something like that). It stays missing until I reboot the server. I was wondering whether there is a way to make the disk never "spin down", or whether something else is happening. I attached my diagnostics file, but will it even help now that I have already rebooted? Thanks for any answers!

Attachment: tower-diagnostics-20191127-1355.zip
JorgeB, posted November 27, 2019

36 minutes ago, Ver7o said: but will it even help now that I have already rebooted?

We need pre-reboot diags.
Ver7o, posted November 27, 2019 (edited November 27, 2019)

So this is a fresh diagnostic. I rebooted the system and didn't start a VM for a few minutes. The unassigned disk went to sleep, and now it's listed under missing historical devices and cannot be woken up unless I reboot. From what I can see now, there was an I/O error on sdc (the unassigned drive), which caused the drive to unmount. As a curiosity question, which file system would you guys recommend for an unassigned drive running Windows, Linux and Mac VMs?

Nov 27 18:21:13 Tower kernel: print_req_error: I/O error, dev sdc, sector 1953662178
Nov 27 18:21:13 Tower kernel: XFS (sdc1): metadata I/O error in "xlog_iodone" at daddr 0x747284a2 len 64 error 5
Nov 27 18:21:13 Tower kernel: XFS (sdc1): xfs_do_force_shutdown(0x2) called from line 1271 of file fs/xfs/xfs_log.c. Return address = 0000000084c58612
Nov 27 18:21:13 Tower kernel: XFS (sdc1): Log I/O Error Detected. Shutting down filesystem
Nov 27 18:21:13 Tower kernel: XFS (sdc1): Please umount the filesystem and rectify the problem(s)

Attachment: tower-diagnostics-20191127-1727.zip
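For anyone who wants to pull the same kind of lines out of a longer syslog, here is a minimal Python sketch. It assumes the message formats match the excerpt above; the `summarize` helper and the two regular expressions are made up for illustration and are not part of Unraid or its diagnostics tooling.

```python
import re

# Sample lines copied from the syslog excerpt in the post above.
LOG = """\
Nov 27 18:21:13 Tower kernel: print_req_error: I/O error, dev sdc, sector 1953662178
Nov 27 18:21:13 Tower kernel: XFS (sdc1): metadata I/O error in "xlog_iodone" at daddr 0x747284a2 len 64 error 5
Nov 27 18:21:13 Tower kernel: XFS (sdc1): Log I/O Error Detected. Shutting down filesystem
"""

# Block-layer error format as it appears in the excerpt (assumed stable).
IO_ERROR = re.compile(r"I/O error, dev (?P<dev>\w+), sector (?P<sector>\d+)")
# XFS messages that name a partition, e.g. "XFS (sdc1): ...".
XFS_MSG = re.compile(r"XFS \((?P<part>\w+)\): (?P<msg>.*)")


def summarize(log_text):
    """Return (block_errors, xfs_messages) found in log_text."""
    block_errors = []
    xfs_messages = []
    for line in log_text.splitlines():
        m = IO_ERROR.search(line)
        if m:
            # Record the device name and the failing sector number.
            block_errors.append((m.group("dev"), int(m.group("sector"))))
            continue
        m = XFS_MSG.search(line)
        if m:
            # Record the partition and the message text that follows it.
            xfs_messages.append((m.group("part"), m.group("msg")))
    return block_errors, xfs_messages


errors, xfs = summarize(LOG)
print(errors)
print(xfs)
```

Running it on the full syslog instead of the three sample lines would show whether other devices report I/O errors around the same time.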
JorgeB, posted November 27, 2019

The device is dropping offline. This looks very suspiciously VM related, like you're using that device for a VM; 26:00.0 is the LSI HBA. Are you starting any other VM? In other words, if all VMs are left off, does the disk still disappear?

Nov 27 18:20:43 Tower kernel: vfio_ecap_init: 0000:26:00.0 hiding ecap 0x1e@0x258
Nov 27 18:20:43 Tower kernel: vfio_ecap_init: 0000:26:00.0 hiding ecap 0x19@0x900
Nov 27 18:20:43 Tower kernel: sd 9:0:1:0: attempting device reset! scmd(0000000074fec607)
Nov 27 18:20:43 Tower kernel: sd 9:0:1:0: [sdc] tag#0 CDB: opcode=0x28 28 00 3a 38 36 18 00 02 f8 00
Nov 27 18:20:43 Tower kernel: scsi target9:0:1: handle(0x0009), sas_address(0x4433221101000000), phy(1)
Nov 27 18:20:43 Tower kernel: scsi target9:0:1: enclosure logical id(0x500605b005492dd0), slot(2)
Nov 27 18:20:43 Tower kernel: sd 9:0:1:0: device reset: FAILED scmd(0000000074fec607)
Nov 27 18:20:43 Tower kernel: scsi target9:0:1: attempting target reset! scmd(0000000074fec607)
Nov 27 18:20:43 Tower kernel: sd 9:0:1:0: [sdc] tag#0 CDB: opcode=0x28 28 00 3a 38 36 18 00 02 f8 00
Nov 27 18:20:43 Tower kernel: scsi target9:0:1: handle(0x0009), sas_address(0x4433221101000000), phy(1)
Nov 27 18:20:43 Tower kernel: scsi target9:0:1: enclosure logical id(0x500605b005492dd0), slot(2)
Nov 27 18:20:43 Tower kernel: scsi target9:0:1: target reset: SUCCESS scmd(0000000074fec607)
Nov 27 18:20:44 Tower kernel: sd 9:0:1:0: Power-on or device reset occurred
Nov 27 18:20:44 Tower kernel: mpt2sas_cm0: attempting host reset! scmd(0000000074fec607)
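The log above shows the resets escalating from device to target to host. As a rough sketch only, the sequence can be extracted with a few lines of Python. The sample lines are a subset copied from the log above; the `reset_events` helper and its patterns are hypothetical and simply assume the wording stays as it is in this excerpt.

```python
import re

# Subset of the reset-related lines from the log in the post above.
LOG = """\
Nov 27 18:20:43 Tower kernel: sd 9:0:1:0: attempting device reset! scmd(0000000074fec607)
Nov 27 18:20:43 Tower kernel: sd 9:0:1:0: device reset: FAILED scmd(0000000074fec607)
Nov 27 18:20:43 Tower kernel: scsi target9:0:1: attempting target reset! scmd(0000000074fec607)
Nov 27 18:20:43 Tower kernel: scsi target9:0:1: target reset: SUCCESS scmd(0000000074fec607)
Nov 27 18:20:44 Tower kernel: mpt2sas_cm0: attempting host reset! scmd(0000000074fec607)
"""

# "attempting <level> reset!" marks the start of a reset attempt.
ATTEMPT = re.compile(r"attempting (device|target|host) reset!")
# "<level> reset: SUCCESS|FAILED" marks its outcome.
RESULT = re.compile(r"(device|target|host) reset: (SUCCESS|FAILED)")


def reset_events(log_text):
    """Return (level, status) tuples in the order they appear in the log."""
    events = []
    for line in log_text.splitlines():
        m = RESULT.search(line)
        if m:
            events.append((m.group(1), m.group(2)))
            continue
        m = ATTEMPT.search(line)
        if m:
            events.append((m.group(1), "ATTEMPT"))
    return events


for level, status in reset_events(LOG):
    print(level, status)
```

A host reset attempt at the end of the list means the driver had escalated all the way to resetting the HBA itself.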
Ver7o, posted November 27, 2019 (edited November 27, 2019)

In this particular case I tried to run a Linux VM (with GPU passthrough) before I noticed sdc was already missing, so naturally nothing happened because the vdisk was not found. I have no clue what happened to the disk before that. As far as I know, group 26 refers to the GPU.

26:00.0 VGA compatible controller: NVIDIA Corporation TU106 [GeForce RTX 2070] (rev a1)

But yeah, it sure looks like that's the case. If the VMs are off, the disk disappears. Before this diagnostic, I just rebooted Unraid and went AFK for a while without any VM started.
JorgeB, posted November 27, 2019

2 minutes ago, Ver7o said: As far as I know group 26 refers to the GPU.

You're right, the HBA is 25:00.0, my mistake. There have been some spin-related issues with devices connected to LSI HBAs lately. The first thing you should try is updating the firmware, since it's very old; the current one is 20.00.07.00. If that doesn't help, and since IIRC spin down settings don't have any effect on unassigned devices, you can try connecting that disk to an onboard SATA port if possible.
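If you want to check an installed firmware string against 20.00.07.00, a dotted version can be compared field by field. This is only a sketch: the helper names are made up, and "19.00.00.00" below is a placeholder, not the version reported in this thread's diagnostics.

```python
def parse_fw(version):
    """Split a dotted firmware string such as '20.00.07.00' into integers."""
    return tuple(int(part) for part in version.split("."))


def is_older(installed, latest="20.00.07.00"):
    """Return True if the installed firmware is older than `latest`."""
    # Tuples compare field by field, left to right.
    return parse_fw(installed) < parse_fw(latest)


# "19.00.00.00" is a hypothetical placeholder, not a value from the thread.
print(is_older("19.00.00.00"))
print(is_older("20.00.07.00"))
```

Comparing integer tuples avoids the pitfalls of comparing the raw strings, which would break as soon as a field has a different number of digits.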
Ver7o, posted November 27, 2019 (edited November 27, 2019)

I'll try plugging the disk into an onboard SATA port and report back in a few days if anything changes. I will also look into how to flash new firmware onto my LSI card. Thank you!
Ver7o, posted December 6, 2019

Update: Plugging the unassigned SSD directly into a motherboard SATA port did seem to solve the issue. However, I haven't updated the firmware on the RAID card. I think the thread can be marked as solved. Thank you @johnnie.black 🍻
JonathanM, posted December 6, 2019

5 hours ago, Ver7o said: I think the thread can be marked as solved.

Edit the first post in the thread and change the title.