Doggeh Posted November 25, 2022 (edited)
Hi folks, I'm running a NetApp DS4246 disk shelf, currently with 14 drives installed. It is connected to my server through a Broadcom / LSI SAS2308 PCI-E controller. The drives are all mounted as Unassigned Devices outside of the array. This was all working fine until the last couple of months, when, roughly once a week, I'd log in and find that the drives are missing, sort of... They all say "Reboot" next to them, and I can't find any information about what this means or why it is happening. They are inaccessible in this state. Once I reboot, everything is fine again. See the screenshot attached. My hypothesis is that the server is losing connection to the disk shelf for some reason, but nothing seems to be obviously faulty and the server logs don't highlight anything that I can see. Has anyone experienced anything like this? Thanks.
JorgeB Posted November 25, 2022
Please post the diagnostics.
Doggeh Posted November 25, 2022
7 minutes ago, JorgeB said: Please post the diagnostics.
The problem hasn't happened for a few days now and I've done a couple of OS updates since then. Is it still useful to attach the diagnostics now, or should I wait for the problem to happen again and then grab them before rebooting? The syslog file in the current diagnostic dump doesn't look like it goes back far enough to be useful. Thanks.
JorgeB Posted November 25, 2022
38 minutes ago, Doggeh said: or wait for the problem to happen again and then grab them before rebooting?
This.
Doggeh Posted November 29, 2022 (edited)
Right, it happened again this afternoon. Diagnostics attached: zeus-diagnostics-20221129-1643.zip. I believe the problem occurred at "Nov 29 15:07:22". Would greatly appreciate any help or insights. Thanks!
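For anyone wanting to narrow a syslog down to the moments around a suspected event like the one above, here is a minimal sketch. It assumes the syslog uses the classic "Mon DD HH:MM:SS" prefix (as in the timestamp quoted above) and that the year is known, since that prefix does not carry one. The function name `lines_near` and its parameters are made up for illustration; it is not part of Unraid or the diagnostics tooling.

```python
from datetime import datetime, timedelta

STAMP_FORMAT = "%Y %b %d %H:%M:%S"


def lines_near(log_lines, when, window_s=120, year=2022):
    """Return the log lines stamped within window_s seconds of `when`.

    `when` is a string such as "Nov 29 15:07:22". The year is an
    assumption supplied by the caller, because the classic syslog
    prefix does not include it.
    """
    target = datetime.strptime(f"{year} {when}", STAMP_FORMAT)
    window = timedelta(seconds=window_s)
    hits = []
    for line in log_lines:
        try:
            # The first 15 characters hold the "Mon DD HH:MM:SS" prefix.
            stamp = datetime.strptime(f"{year} {line[:15]}", STAMP_FORMAT)
        except ValueError:
            # Skip lines without a parseable timestamp prefix.
            continue
        if abs(stamp - target) <= window:
            hits.append(line)
    return hits


# Possible usage (the path is hypothetical):
# with open("logs/syslog.txt", errors="replace") as fh:
#     for line in lines_near(fh, "Nov 29 15:07:22"):
#         print(line, end="")
```

Widening `window_s` helps when the exact moment of the disconnect is uncertain, since the first controller or enclosure messages may appear some time before the drives are flagged.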
JorgeB Posted November 29, 2022
On 11/25/2022 at 2:48 PM, Doggeh said: My hypothesis is that the server is losing connection to the disk shelf
Yes, that's what it looks like. The HBA should not be the problem; I suspect a cable or enclosure problem. I would start with a different cable, since it's the easiest and cheapest to replace.
Doggeh Posted November 29, 2022 (edited)
Ugh, that's a pain. The only other thing I can think of is that I think it was working fine until I did a bit of maintenance and moved the rack (the timeline is a bit blurry at this point). As part of that process I'm pretty sure I plugged the disk shelf into my UPS, whereas previously it was just plugged into the wall. I'm not having power outages or anything like that, so it's probably still just the cable, though I do wonder if the UPS isn't quite handling a power surge (or something like that). Maybe I'll plug the disk shelf directly back into the mains for a week and see if that fixes it before ordering a new QSFP to miniSAS cable. Thanks for taking a look.