
Both Parity Drives Disabled



On Friday or yesterday I downloaded 6.8.3 (I was on 6.8.2 at the time) and did not reboot. I woke up this morning and decided to reboot, but the web GUI was not going to the reboot countdown screen, so I used PuTTY to send the reboot command. It booted back up with both parity drives disabled. Docker and auto disk start were enabled, so it looked like writes were happening. I have 24 drives connected to the same 6 backplanes, 16 on 1 SAS card and 8 on motherboard SATA, so this doesn't seem like a cabling/controller issue that would randomly affect only the parity drives. I rebooted another time to see if that did anything. It's hard for me not to suspect the new version under these circumstances.

nastheripper-diagnostics-20200308-0821.zip


We can't see what happened, but it was likely a controller error. When errors on multiple drives happen at the same time, Unraid will disable as many drives as there are parity devices, and which drives get disabled is luck of the draw. It's unlikely to be upgrade related, especially since, if I understood correctly, the problem was before rebooting, and the new release would only be loaded after the reboot, when the disks were already disabled. Just re-sync parity, and if it happens again try to get the diags before rebuilding.

On 3/9/2020 at 4:23 AM, johnnie.black said:

We can't see what happened, but it was likely a controller error. When errors on multiple drives happen at the same time, Unraid will disable as many drives as there are parity devices, and which drives get disabled is luck of the draw. It's unlikely to be upgrade related, especially since, if I understood correctly, the problem was before rebooting, and the new release would only be loaded after the reboot, when the disks were already disabled. Just re-sync parity, and if it happens again try to get the diags before rebuilding.

I suppose there was some issue, and perhaps it was controller errors keeping the array from stopping so it could reboot.  There were no errors sent in emails or in the webgui, but it's possible I didn't notice the parity drives being disabled.

 

This is the third issue I have had with this server recently and it is making me a bit concerned/anxious.  So here are some paranoid ideas I have had as to things that are causing issues:

 

1. My motherboard temp is being reported at around 96 C. This was initially concerning, then I learned that Ryzen CPUs report 27 C over actual temps, but now I am back to being concerned. My setup is a Threadripper 1900X on an X399 Designare EX, the temp drivers are it87 and k10temp, and the one reporting high is "k10temp-mb temp". Is this real? It seems unlikely. Am I using the wrong driver?

 

2. My SAS card (LSI 9201-16e) is a somewhat recent purchase from October last year. The listing says P20, but there are different P20 revisions, and it still has its BIOS, which I did not flash on the last card I had when I updated it. I don't think that makes a difference after booting. Still I'm going to try to figure out which version I'm on and update if possible. It looks like base P20 might be bad, and the newest is 20.00.07.00.

 

3. Power related: the server is plugged into a CyberPower CPS1215RMS surge protector, and the PSU is an EVGA SuperNOVA 1000 G1+, which seems like a good quality PSU. I have a Powerwall 2, which I see takes 300-2000 milliseconds for its relay switch-over, so based on https://teslamotorsclub.com/tmc/threads/powerwall-2-ups-connundrum-and-solution.130085/ it looks like I may want a UPS with a tiny battery that accepts 65 Hz power. I don't currently have a UPS.

 

4. Gremlins/Underpants Gnomes/Static Electricity?

 

Probably grasping, but I'd like to try to avoid further issues.

7 hours ago, bobobeastie said:

Still I'm going to try to figure out which version I'm on

It's already using the latest firmware:

Mar  8 08:15:24 NAStheRIPPER kernel: mpt2sas_cm0: LSISAS2116: FWVersion(20.00.07.00), ChipRevision(0x02), BiosVersion(07.39.02.00)

The BIOS version (or whether there is one at all) is not important for Unraid.
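
If you want to check this yourself rather than read through the syslog by eye, something like the following should work. It is only a rough sketch: the log line is the one quoted above, but the helper names (`parse_hba_versions`, `version_tuple`) are made up for illustration, and it assumes the driver keeps printing the versions in the same `Key(value)` form.

```python
import re

# Log line copied from the syslog excerpt quoted above.
LINE = (
    "Mar  8 08:15:24 NAStheRIPPER kernel: mpt2sas_cm0: LSISAS2116: "
    "FWVersion(20.00.07.00), ChipRevision(0x02), BiosVersion(07.39.02.00)"
)

# Matches Key(value) pairs such as FWVersion(20.00.07.00).
FIELD_RE = re.compile(r"(\w+)\(([^)]*)\)")


def parse_hba_versions(line):
    """Return the Key(value) fields of the version line as a dict (hypothetical helper)."""
    return dict(FIELD_RE.findall(line))


def version_tuple(text):
    """Turn a dotted version like '20.00.07.00' into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.split("."))


fields = parse_hba_versions(LINE)
# 20.00.07.00 is the newest firmware mentioned earlier in this thread.
is_latest = version_tuple(fields["FWVersion"]) >= version_tuple("20.00.07.00")
print(fields["FWVersion"], "latest" if is_latest else "older")
```

Comparing the versions as tuples of integers avoids string-comparison surprises if a field ever has a different number of digits.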

 

