bobobeastie Posted March 8, 2020
On Friday or yesterday I downloaded 6.8.3 (I was on 6.8.2 at the time) and did not reboot. I woke up this morning and decided to reboot, but the web GUI was not going to the reboot countdown screen, so I used PuTTY to send the reboot command. It booted back up with both parity drives disabled. Docker and auto disk start were enabled, so it looked like writes were happening. I have 24 drives connected to the same 6 backplanes, with 1 SAS card for 16 drives and 8 on motherboard SATA, so this doesn't seem like a cabling/controller issue that would randomly affect only the parity drives. I rebooted another time to see if that did anything. It's hard for me not to suspect the new version under these circumstances.
nastheripper-diagnostics-20200308-0821.zip
bobobeastie Posted March 8, 2020 (Author)
I'm guessing that ultimately I need to disable and re-enable both parity drives and then rebuild parity. But guessing, with me and Unraid, has made things way worse in the past. I'm also concerned about what went wrong, and if this is a bug I want to help if possible.
JorgeB Posted March 9, 2020
We can't see what happened, but it was likely a controller error. When errors on multiple drives happen at the same time, Unraid will disable as many drives as there are parity devices, and which drives get disabled is luck of the draw. It's unlikely to be upgrade related, especially since, if I understood correctly, the problem occurred before rebooting, and the new release would only be loaded after the reboot, when the disks were already disabled. Just re-sync parity, and if it happens again, try to get the diags before rebuilding.
bobobeastie Posted March 11, 2020 (Author)
On 3/9/2020 at 4:23 AM, johnnie.black said:
We can't see what happened but likely a controller error, when errors on multiple drives happen at the same time Unraid will disable as many drives as there are parity devices, which drives get disabled is luck of the draw, unlikely to be upgraded related, especially since if I understood correctly problem was before rebooting, and the new release would only be loaded after the reboot, when the disks were already disabled, just re-sync parity and try to get the diags before rebuilding if it happens again.

I suppose there was some issue, and perhaps it was controller errors keeping the array from stopping so it could reboot. There were no errors sent in emails or shown in the web GUI, but it's possible I didn't notice the parity drives being disabled. This is the third issue I have had with this server recently and it is making me a bit concerned/anxious. So here are some paranoid ideas I have had as to things that could be causing issues:
1. My motherboard temp is being reported at around 96 C. This was initially concerning, but then I learned that Ryzen CPUs report 27 C over actual temps, but now I am back to concerned. My setup is a Threadripper 1900X on an X399 Designare EX, the temp drivers are it87 and k10temp, and the one reporting high is "k10temp-mb temp". Is this real? It seems unlikely. Am I using the wrong driver?
2. My SAS card (LSI 9201-16e) is a somewhat recent purchase from October last year. The listing says P20, but there are different P20 revisions, and it still has its BIOS, which I did not flash on the last card I had when I updated it. I don't think that makes a difference after booting. Still, I'm going to try to figure out which version I'm on and update if possible. It looks like base P20 might be bad, and the newest is 20.00.07.00.
3. Power related: the server is plugged in to a CyberPower CPS1215RMS Surge Protector, and the PSU is an EVGA SuperNOVA 1000 G1+, which seems like a good quality PSU. I have a Powerwall 2, whose relay switch-over I see takes 300-2000 milliseconds, so based on https://teslamotorsclub.com/tmc/threads/powerwall-2-ups-connundrum-and-solution.130085/ it looks like I may want a UPS with a tiny battery that accepts 65 Hz power. I don't currently have a UPS.
4. Gremlins/Underpants Gnomes/Static Electricity? Probably grasping, but I'd like to try to avoid further issues.
JorgeB Posted March 11, 2020
7 hours ago, bobobeastie said:
Still I'm going to try to figure out which version I'm on

It's already using the latest firmware:
Mar 8 08:15:24 NAStheRIPPER kernel: mpt2sas_cm0: LSISAS2116: FWVersion(20.00.07.00), ChipRevision(0x02), BiosVersion(07.39.02.00)
The BIOS version (or whether there is one at all) is not important for Unraid.
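
If you want to check this yourself from a saved syslog, something like the following might do as a rough sketch. It only assumes the banner has the same shape as the single line quoted above; the regular expression and the helper name `parse_fw_banner` are made up for illustration and are not part of Unraid or the driver.

```python
import re

# Pattern written against the one syslog line quoted above (an assumption:
# other driver versions may format this banner differently).
BANNER = re.compile(
    r"(?P<driver>mpt\dsas_cm\d+): (?P<chip>\w+): "
    r"FWVersion\((?P<fw>[\d.]+)\), "
    r"ChipRevision\((?P<rev>0x[0-9a-fA-F]+)\), "
    r"BiosVersion\((?P<bios>[\d.]+)\)"
)


def parse_fw_banner(line):
    """Return the banner fields as a dict, or None if the line does not match."""
    match = BANNER.search(line)
    return match.groupdict() if match else None


# The line from the diagnostics, copied verbatim.
line = (
    "Mar 8 08:15:24 NAStheRIPPER kernel: mpt2sas_cm0: LSISAS2116: "
    "FWVersion(20.00.07.00), ChipRevision(0x02), BiosVersion(07.39.02.00)"
)

info = parse_fw_banner(line)
print(info["chip"], "firmware", info["fw"], "BIOS", info["bios"])
```

To scan a whole syslog, loop over its lines and keep the ones where `parse_fw_banner` returns something other than `None`.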