
Yet more errors after a brief spell of being off


loady


It seems that every time my server has been off for any length of time (this time I was upgrading and had to RMA the motherboard), I get errors when I go to use it again...

https://gyazo.com/a3477a333ee6f46d6c8970b33098226b

 

Disk three is saying 'device is disabled, contents emulated'

 

Disk four is reporting normal operation, yet BOTH disks are showing as unmountable, no file system. The only thing different, apart from the mobo and CPU upgrade, is a RAID card:

 

https://www.amazon.co.uk/10Gtek-Internal-Express-Controller-SAS2008/dp/B01M2AC40Y/ref=sr_1_1?ie=UTF8&qid=1537007716&sr=8-1&keywords=SAS+controller+HBA&pldnSite=1

 

I have that one. It is a verified server pull, so it's not a Chinese knock-off, and it has been flashed to IT mode. I have not moved any of the disks.

 

What should I do to get this back up and running again so I can just leave it be? Diags attached. Thanks

 

EDIT: Checked and reseated all cables. I also swapped the SAS cable going to drive three for an unused one. Things changed a little: no more unmountable filesystems, but this disk is still saying disabled, contents emulated.

warptower-diagnostics-20200811-1225.zip

Edited by loady
18 hours ago, johnnie.black said:

It did look like a power/connection problem.

 

Once a disk gets disabled it needs to be rebuilt. If the emulated disk is mounting and the contents look correct, you can rebuild on top.

https://wiki.unraid.net/Troubleshooting#Re-enable_the_drive

Thanks for that. I read it and opted for a full rebuild rather than trust. I hadn't done anything with the server, only switched it on. Anyway, I left it rebuilding last night and came back this morning to see that it had finished, and now disk three is OK but disk four is disabled and contents emulated... It's getting ridiculous. It seems like I fix one drive and another takes its place for errors... Absolutely nothing was touched, moved or anything.

 

 

39 minutes ago, johnnie.black said:

Again looks like a power/connection problem.

How can a power/connection problem move from one drive to another? It doesn't make sense. I have five 3.5" HDDs in three 5.25" bays. Is it possible, or even heard of, that the power issue just moves to the next drive? Could the backplane be faulty? I'm thinking the drive gets enabled to do the rebuild, but in doing so another one gets disabled?

 

OK, this time I don't want to rebuild the drive, I just want to trust what's on there.

Edited by loady
2 minutes ago, johnnie.black said:

Could be a bad PSU, or just a cable/splitter if they share one, could also be the miniSAS to SATA breakout cable for example.

Well, it was all OK for a bit. All my dockers had disappeared and I was just in the process of getting them back when disk 4 disabled right in front of my eyes. The miniSAS to SATA cables were brand new. All my SATA ports are empty, so would it be a good idea to stop the array, remove the miniSAS to SATA cable from disk 4 and plug disk 4 into a free SATA port? If the problem arises again, that would suggest a power issue and not a cable problem... I have only started getting these problems since this powered drive cage was introduced, and it was bought second hand. I can hear that noise a drive makes when you just cut power to it.

48 minutes ago, johnnie.black said:

There were call traces during the sync, updating to v6.8 should fix that, then try again.

OK, updated. I had to pause the parity sync to do it. I rebooted and the parity sync survived the reboot in pause mode but is still hanging there. Do I need to stop it and start again?

15 hours ago, johnnie.black said:

Parity sync pause doesn't survive a reboot.

Hmmm... the server was hung and I couldn't power it down with telnet or SSH, so I had to hard power off. Anyway, I left it doing a parity rebuild last night and it should have been finished when I got here. It was still doing a parity rebuild, but not the same one: from what I can see it started another one at around 3am. It's got about an hour left right now, but disk 4 seems to be OK at the moment.

warptower-diagnostics-20200814-0913.zip


It was due to finish. I heard the drives chatter, disk one is spun down and it seems to be hanging on the rebuild again... The estimated time to finish is going into days now. Getting a bit tired of this every time I turn the server on :(

 

It's definitely stopped. All the drives are spun down and I am sat here with a server with no parity at the moment. This time it stopped at 85.4%.

warptower-diagnostics-20200814-1014.zip


Seriously, the worst thing you want to hear. The server's properly hung now. How do I shut it down via telnet or SSH? I can't keep hard shutting down, it can't be good for the drives?

16 minutes ago, johnnie.black said:

Server is still crashing:

 


Aug 14 09:30:39 Warptower kernel: BUG: unable to handle kernel paging request at 0000000000cd0038

 

Possibly a hardware issue, start by running memtest.
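For anyone wanting to check their own logs for the same kind of crash, here is a minimal sketch that scans syslog text for lines like the one quoted above. It assumes the log is available as plain text. The function name `find_kernel_bugs`, the extra "Call Trace" marker and the second sample line are illustrative assumptions, not anything taken from Unraid or from the attached diagnostics.

```python
import re

# Markers that tend to accompany a kernel crash. The first is taken from the
# log line quoted above; "Call Trace" is an assumed additional marker.
CRASH_MARKERS = ("BUG: unable to handle kernel paging request", "Call Trace")

# Matches the "Mon DD HH:MM:SS host kernel: message" layout of the quoted line.
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+[\d:]{8})\s+(\S+)\s+kernel:\s+(.*)$")


def find_kernel_bugs(log_text):
    """Return (timestamp, message) pairs for kernel lines containing a crash marker."""
    hits = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not a kernel line in the expected layout
        timestamp, _host, message = match.groups()
        if any(marker in message for marker in CRASH_MARKERS):
            hits.append((timestamp, message))
    return hits


# First line is the one quoted in this thread; the second is made-up filler
# to show that non-kernel lines are skipped.
sample = """\
Aug 14 09:30:39 Warptower kernel: BUG: unable to handle kernel paging request at 0000000000cd0038
Aug 14 09:30:40 Warptower someservice: unrelated filler line
"""

for timestamp, message in find_kernel_bugs(sample):
    print(timestamp, "|", message)
```

To use it on a real log, read the syslog file into a string and pass that to `find_kernel_bugs` instead of `sample`.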

EDIT: I just held the power button. The drives were all spun down, so not too much of a problem.

Edited by loady
1 hour ago, johnnie.black said:

Type "powerdown" on the console, if it doesn't shutdown after a couple of minutes you'll need to force it.

Yes, I had to force it. I thought I would double check that the RAM is seated correctly, and I then saw that I had forgotten to put the memory sticks in the correct slots: they were in A1/B2 when they should have been in A2/B2. Would this have some bearing on the issues? Should I go straight for another rebuild, which will take about 8 hours, as opposed to a memtest of 24/48 hours?

 

EDIT: I am really at a loss as to what I should do. I just rebooted and I could see the power LED for disk 4 light up, then it would chirp and do it again, constantly. I rebooted again and now disk 3 is saying disabled. What the hell is going on? The PSU? The drive chassis? Should I move as many of the drives as I can to SATA and see if the problem persists? I only have 6 SATA ports on this and 7 drives, one being parity and one being cache.

Also, the parity drive rebuild completed in 10 minutes on this reboot.

 

warptower-diagnostics-20200814-1155.zip

Edited by loady
3 hours ago, johnnie.black said:

Type "powerdown" on the console, if it doesn't shutdown after a couple of minutes you'll need to force it.

Would you mind giving your opinion on this short video I made? I am getting suspicious of the drive chassis now. I've pulled disks 3 and 4 out and put them on SATA ports with their own power source. No more chirping from any drives and the power LED is not tripping like it was. I have yet another parity check started and it sounds much better now. If all goes well and no more drives get spat out, would this suggest that the chassis has a power distribution issue, rather than the main PSU in the case? It basically houses 5 drives and takes two Molex plugs to power it, so the power from two Molex plugs is being split to power 5 drives. I tried powering it from Molex plugs on separate rails, but it still kept chirping like the power was tripping. I don't really want to buy a new chassis if the problem lies with the main PSU not being able to deliver the power.

 

https://photos.app.goo.gl/QzdsyxWjM1dU8UF39

Edited by loady

Think I got turned over with this chassis... I've had nothing but drive issues since the day I got it. So when it's tripping out, it's essentially like pulling a drive out while the array is started. I wonder if I can get a new backplane for it. What if the problem is that it can't draw enough power from my PSU? The 2 Molex plugs that power it are on the same string. Would it make a difference if I were to use Molex from 2 separate strings?

 

Edited by loady
46 minutes ago, johnnie.black said:

It could.

What I'll do then, once the parity has finished, is set it up that way. If I get disks disabling, I can just stop the array, do a new config and check the box for "parity is valid". That should stop the drive from rebuilding itself again?
