confirmed powering issue [SOLVED]


So I have had trouble with disk three being spat out/disabled. I tried many things, got fed up and bought a new drive. I did a parity swap, then used the former parity drive to have disk three rebuilt onto it. The server seems to have been fine for a couple of weeks; it has been left on 24/7 instead of being turned on only when I need it.

Today I am preclearing the drive that was formerly disk three, as I did a full test on it and it came back 100%. Just this minute I wanted to grab some files from the array. The drives were all spun down and I could see them spin up, then I heard that ominous chirp from the drive, repeating like it's being stopped, and then disk three showed up as disabled. So I know it wasn't the disk.

I'm at a loss as to where to look next. I had the same issue when the array was connected via SATA ports. I bought a SAS RAID card from a server pull, so it wasn't a Chinese knock-off, and the SAS cables are all brand new. It's always the disk that is assigned as disk three, no matter what drive is in there. I can't see it being the main power supply for the motherboard, because why is it always disk three? Same for the RAID card: why always disk three? Could it be the backplane in the case affecting that slot? The case was a brand new 24-bay rackmount case.

Any suggestions on where to look next? Where should I throw the next bundle of cash?

Can I just put the disk back into the array without having it rebuild again? I know the disk is good and nothing is being written to the array. I am, however, running a preclear on a new disk.

warptower-diagnostics-20210609-1326.zip

Edited by loady

The logs are spammed with Docker messages and have rotated, so I can't see the initial boot info or the LSI firmware version. Make sure it's using the latest one, 20.00.07.00; other P20 releases in particular have known issues. Other than that, you can try disabling spin-down to see if it helps, even if it's for that disk only.

On 6/9/2021 at 3:12 PM, JorgeB said:

The logs are spammed with Docker messages and have rotated, so I can't see the initial boot info or the LSI firmware version. Make sure it's using the latest one, 20.00.07.00; other P20 releases in particular have known issues. Other than that, you can try disabling spin-down to see if it helps, even if it's for that disk only.

OK, I'll do a reboot and then grab another set of diags. Is the LSI firmware for the RAID card? I know that when I bought it, it had been flashed with some firmware that Unraid required for it to work properly.

20 hours ago, JorgeB said:

Jun 10 17:06:36 Warptower kernel: mpt2sas_cm0: LSISAS2008: FWVersion(20.00.07.00), ChipRevision(0x03), BiosVersion(07.39.02.00)

It's already on the latest one.
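
For anyone wanting to check this in their own syslog, here is a minimal Python sketch that pulls the firmware version out of a line like the one quoted above. The function name `firmware_version` and the regular expression are my own illustration, not part of Unraid or the mpt2sas driver, and the sketch assumes the line keeps the `FWVersion(a.b.c.d)` shape shown here.

```python
import re

# Sample line copied from the diagnostics quoted in this thread.
LINE = ("Jun 10 17:06:36 Warptower kernel: mpt2sas_cm0: LSISAS2008: "
        "FWVersion(20.00.07.00), ChipRevision(0x03), BiosVersion(07.39.02.00)")

# Matches four dot-separated numeric fields inside FWVersion(...).
FW_RE = re.compile(r"FWVersion\((\d+(?:\.\d+){3})\)")

# 20.00.07.00 is the version named earlier in the thread as the latest.
LATEST = (20, 0, 7, 0)


def firmware_version(line):
    """Return the firmware version as a tuple of ints, or None if absent."""
    match = FW_RE.search(line)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


version = firmware_version(LINE)
if version == LATEST:
    print("firmware matches 20.00.07.00")
else:
    print("firmware differs from 20.00.07.00:", version)
```

Comparing tuples of integers rather than raw strings avoids surprises if a field is ever printed without its leading zero.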

 

OK, so maybe to rule out the slot itself, I just want to add the drive back into the array; there have been no writes to it. Then I will put it in a different slot and see if it happens again. How would I go about that?


Took another look at the last diags to see if the emulated disk was mounting, and there are more disk errors, now on disk4:

Jun 10 17:07:44 Warptower kernel: sd 11:0:2:0: [sdd] tag#801 ASC=0x4 ASCQ=0x0
Jun 10 17:07:44 Warptower kernel: sd 11:0:2:0: [sdd] tag#801 CDB: opcode=0x28 28 00 ae ad 2b 40 00 03 38 00
Jun 10 17:07:44 Warptower kernel: blk_update_request: I/O error, dev sdd, sector 2930584384 op 0x0:(READ) flags 0x0 phys_seg 103 prio class 0
Jun 10 17:07:44 Warptower kernel: md: disk4 read error, sector=2930584320
Jun 10 17:07:44 Warptower kernel: md: disk4 read error, sector=2930584328
Jun 10 17:07:44 Warptower kernel: md: disk4 read error, sector=2930584336
Jun 10 17:07:44 Warptower kernel: md: disk4 read error, sector=2930584344

 

Looks more like a power/connection problem.
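
As a rough aid for reading excerpts like this, the Python sketch below decodes the READ(10) command bytes from the `CDB:` line and counts md read errors per disk. The helper names (`decode_read10`, `read_errors_per_disk`) and the regular expressions are hypothetical, written only for this illustration; the log text is the excerpt above. In the SCSI sense-code tables, ASC 0x04 with ASCQ 0x00 is "logical unit not ready, cause not reportable", which would fit a drive dropping out rather than a bad sector, though that reading is an interpretation and not something the log states outright.

```python
import re
from collections import Counter

# Kernel log excerpt copied from the post above.
LOG = """\
Jun 10 17:07:44 Warptower kernel: sd 11:0:2:0: [sdd] tag#801 ASC=0x4 ASCQ=0x0
Jun 10 17:07:44 Warptower kernel: sd 11:0:2:0: [sdd] tag#801 CDB: opcode=0x28 28 00 ae ad 2b 40 00 03 38 00
Jun 10 17:07:44 Warptower kernel: blk_update_request: I/O error, dev sdd, sector 2930584384 op 0x0:(READ) flags 0x0 phys_seg 103 prio class 0
Jun 10 17:07:44 Warptower kernel: md: disk4 read error, sector=2930584320
Jun 10 17:07:44 Warptower kernel: md: disk4 read error, sector=2930584328
Jun 10 17:07:44 Warptower kernel: md: disk4 read error, sector=2930584336
Jun 10 17:07:44 Warptower kernel: md: disk4 read error, sector=2930584344
"""

# Captures the space-separated hex bytes that follow "opcode=0x..".
CDB_RE = re.compile(r"CDB: opcode=0x[0-9a-f]+((?: [0-9a-f]{2})+)")
MD_ERR_RE = re.compile(r"md: (disk\d+) read error, sector=(\d+)")


def decode_read10(cdb_hex):
    """Decode a READ(10) CDB: bytes 2-5 hold the LBA, bytes 7-8 the block count."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")
    blocks = int.from_bytes(cdb[7:9], "big")
    return lba, blocks


def read_errors_per_disk(log):
    """Count 'md: diskN read error' lines per disk name."""
    return Counter(m.group(1) for m in MD_ERR_RE.finditer(log))


for match in CDB_RE.finditer(LOG):
    lba, blocks = decode_read10(match.group(1))
    print(f"READ(10) at LBA {lba} for {blocks} blocks")

for disk, count in read_errors_per_disk(LOG).items():
    print(f"{disk}: {count} read errors")
```

The decoded LBA is the same number the kernel reports as the failing sector on `sdd`, so the `CDB:` line and the `I/O error` line describe the same failed read.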

17 hours ago, JorgeB said:

Took another look at the last diags to see if the emulated disk was mounting, and there are more disk errors, now on disk4:

Jun 10 17:07:44 Warptower kernel: sd 11:0:2:0: [sdd] tag#801 ASC=0x4 ASCQ=0x0
Jun 10 17:07:44 Warptower kernel: sd 11:0:2:0: [sdd] tag#801 CDB: opcode=0x28 28 00 ae ad 2b 40 00 03 38 00
Jun 10 17:07:44 Warptower kernel: blk_update_request: I/O error, dev sdd, sector 2930584384 op 0x0:(READ) flags 0x0 phys_seg 103 prio class 0
Jun 10 17:07:44 Warptower kernel: md: disk4 read error, sector=2930584320
Jun 10 17:07:44 Warptower kernel: md: disk4 read error, sector=2930584328
Jun 10 17:07:44 Warptower kernel: md: disk4 read error, sector=2930584336
Jun 10 17:07:44 Warptower kernel: md: disk4 read error, sector=2930584344

 

Looks more like a power/connection problem.

Do you think the power supply could be the issue? It seems to be the only constant left: the mobo is new and it was doing this on the former one, the case is all new, and the server card was purchased because I thought it was possibly the drive bays I was using. The power supply has been in use for a long time. It's 750 W, I think, and it's running 7 drives, the mobo and an i7.

23 hours ago, JorgeB said:

Strong possibility.

Then I will buy a new 1000 W power supply.

I don't understand why disk three is now unmountable as well as disabled. Disk four came out of nowhere and is just unmountable. How do I add them back in without rebuilding?

EDIT: I'll start a separate thread for sorting the drives and leave this thread for when the new PSU comes.

Edited by loady
7 minutes ago, JorgeB said:

Because there were errors on another disk, Unraid can't emulate it any longer. Rebooting should fix it, if there aren't any more errors on different disks.

That worked, thanks. I attached diags again. I can see the errors counting up on disk 4. I want to put disk three back into the array without rebuilding it: do I stop the array, do a New Config retaining slot allocation, and then check the "parity is valid" box before starting?

warptower-diagnostics-20210613-1041.zip


New Config has added disk 3 back in, but now disk 4 is showing as disabled. Losing the will to live here. I have done New Config 4 times now, and each start of the array alternates between disk 3 and disk 4 being disabled. Currently disk 3 is disabled again; if I do New Config and start, disk 3 is not disabled but disk 4 will be. It must be a power issue.

Edited by loady
6 minutes ago, JorgeB said:

IMHO it's a bad idea to try to fix the array while there are multiple disk errors. You need to fix that first, i.e., try another PSU, controller, cables, etc.

Just ordered a new 1050 W PSU. I'll leave it with the one disk disabled for now and install the new PSU when it comes. The RAID-card-to-SAS cables are all brand new, and this issue kept happening before I had this RAID card and SAS cables; it kept happening when the drives were connected to SATA ports on the former mobo too. Fingers crossed it's the PSU, but I doubt I'll be that lucky. Issues like this are more like witchcraft.

  • 3 weeks later...

I got the new PSU installed, a 1000 W modular jobby, and everything seems fine. I'm certainly not hearing that chirping sound from various disks, so I'm not sure whether the PSU was dying or just wasn't powerful enough when all the disks spun up; it was only affecting the one row of drive bays. Anyway, all good now: the disk that kept getting disabled has been precleared, put back into the array, and is 100% fit. So now I will go grab the other disks that were getting disabled and see if they can be reintroduced.
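
On the "dying or not powerful enough" question, a back-of-the-envelope estimate of the 12 V load at spin-up can be sketched in Python as below. Only the drive count comes from this thread; the per-drive currents are illustrative assumptions of the kind found on 3.5-inch drive datasheets, not measurements from this server, and `rail_watts` is a made-up helper name.

```python
# Only DRIVES comes from the thread ("running 7 drives"). The current figures
# are assumptions for illustration; check the actual drive datasheets.
DRIVES = 7
SPINUP_AMPS_12V = 2.0   # assumed peak 12 V current per drive while spinning up
IDLE_AMPS_12V = 0.5     # assumed 12 V current per drive once it is spinning
RAIL_VOLTS = 12.0


def rail_watts(drives, amps_per_drive, volts=RAIL_VOLTS):
    """Total power drawn on the rail by `drives` drives at the given current."""
    return drives * amps_per_drive * volts


spinup = rail_watts(DRIVES, SPINUP_AMPS_12V)
idle = rail_watts(DRIVES, IDLE_AMPS_12V)
print(f"12 V load with all drives spinning up: {spinup:.0f} W")
print(f"12 V load with all drives idle: {idle:.0f} W")
```

If those assumed figures are anywhere near right, seven drives spinning up together draw well under 750 W, which would lean towards an ageing supply or the power feed to that one row of bays rather than total wattage. That is only a sketch under the stated assumptions, not a diagnosis.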

  • loady changed the title to confirmed powering issue [SOLVED]
