
Disks unmountable and another drive with thousands of errors


Solved by JorgeB


Hi community,

I am super nervous, as my server seems to be dying on me little by little. It all started with Disk 2 being disabled and emulated. Suddenly it was unmountable: wrong filesystem.

Based on help I received here earlier, I suspected a cabling problem. I opened up the server, tidied up the cabling and made sure all drives were firmly connected, both SATA and power.
That was when Disk 2 became unmountable. I then followed the official instructions and deleted the log to repair the filesystem. That put 2.46 TB in a lost+found folder. A lot, but not the end of the world. I went through it and it seemed to be only media, which I could possibly rename with FileBot.

I went to bed thinking I had solved the issue, but woke up to the disk being unmountable again. And Disk 7 now had more than 300,000 errors.

I noticed that most of the content of the drives was gone from SMB. I rebooted and the content was back. 

I have ordered a new, bigger power supply, hoping that it is a power issue, although I would have thought my 750 W PSU should be sufficient for my ten drives.
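For a rough sense of why ten drives can still matter on a 750 W unit, here is a minimal back-of-the-envelope sketch of the 12 V load when every drive spins up at once. The per-drive spin-up current is an assumption chosen for illustration, not a figure from this server or its diagnostics; the real value is on each drive's datasheet, and what counts is the PSU's 12 V rail rating and the condition of its cables, not only its total wattage.

```python
def spinup_watts(n_drives: int, amps_12v_per_drive: float, volts: float = 12.0) -> float:
    """Estimate the combined 12 V power draw while all drives spin up together."""
    return n_drives * amps_12v_per_drive * volts


# ASSUMPTION: 2.0 A per drive on the 12 V rail during spin-up (illustrative only).
ASSUMED_SPINUP_AMPS = 2.0

if __name__ == "__main__":
    watts = spinup_watts(10, ASSUMED_SPINUP_AMPS)
    print(f"Estimated simultaneous spin-up load: {watts:.0f} W on the 12 V rail")
```

Under that assumption the drives alone are well within 750 W, which suggests that an ageing PSU, a weak rail or bad power cabling is a more likely suspect than raw capacity.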

I haven't had time to install it yet. The content on the disks has disappeared once or twice again, but seems to come back when I stop the array and start it again.

I have removed Disk 2 from the array for now, and it is in Unassigned Devices. My thinking was that the emulated content would be more complete if the drive's filesystem can't be recognised anyway.

This afternoon Disk 3 has also become unmountable. What is going on? Only two of the disks are on the same controller, so I am not sure that is the cause.

I wanted to post the most recent diagnostics here before I replace the power supply, which I hope to have time for tomorrow.

alameda-diagnostics-20240505-1822.zip


What type of connection do you have to these drives? I have a couple of servers with RAID controllers flashed to IT mode that use SFF-8087 to 4x SATA breakout cables. I have had two of those special SATA cables fail, and I have never found any that I felt were good quality. When they fail, they usually show up as CRC errors in the SMART data on the disks.
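As a minimal sketch of how to check for this, the snippet below pulls the raw CRC error counter out of the text that `smartctl -A` prints. It assumes the drive reports it as attribute 199 under the name `UDMA_CRC_Error_Count`, which is common but not universal, and the sample text is made up for illustration rather than taken from the diagnostics in this thread.

```python
# ASSUMPTION: the drive exposes the counter under this attribute name (ID 199).
CRC_ATTRIBUTE = "UDMA_CRC_Error_Count"


def crc_error_count(smartctl_output: str):
    """Return the raw CRC error count from `smartctl -A` text, or None if absent."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have 10 columns; the raw value starts in the 10th.
        if len(fields) >= 10 and fields[1] == CRC_ATTRIBUTE:
            return int(fields[9])
    return None


# Hypothetical sample output, not real data from this server.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   055   055   000    Old_age   Always       -       33012
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       17
"""

if __name__ == "__main__":
    print(crc_error_count(SAMPLE))
```

The counter never resets, so what matters is whether it keeps climbing after a cable has been swapped, not its absolute value.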

3 hours ago, wildfire305 said:

What type of connection do you have to these drives? I have a couple of servers with RAID controllers flashed to IT mode that use SFF-8087 to 4x SATA breakout cables. I have had two of those special SATA cables fail, and I have never found any that I felt were good quality. When they fail, they usually show up as CRC errors in the SMART data on the disks.

Hi Wildfire. Thank you for responding.

Two of the failing drives are on one of those cables, but the third is connected directly to a SATA port on the motherboard with a regular SATA cable. I do have a new "4 SATA" cable that I will try now as well, just to rule that out.


Hi Jorge,

Thanks for chipping in. I have just replaced the PSU and as many of the power cables as I had spares for.

So far everything looks healthier. I repaired the filesystems on disks 2 and 3, and both are recognised for now.

I have paused the data rebuild, as I am nervous about all the missing data that belonged on disk 2. It was already starting to go missing while disk 2 was emulated. My thinking is that the disk has all the data, but the emulation from parity now has an incomplete picture? Apologies if I am not using the right terms.

In any case, would it even be possible to activate disk 2 without a data rebuild? Or should I just bite the bullet and let it rebuild, which would also be a stress test for the new power supply? The problems I have been having always seem to occur during parity checks or data rebuilds, I am guessing because all the drives are spun up then.

Hi Jorge. I'm out of the house at the moment. Will post diagnostics when I'm back.

When both disks were unmountable I was missing loads of data. Obviously because, as you say, I can only emulate one disk. 
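The one-disk limit can be illustrated with a minimal sketch, assuming a single parity drive that stores the byte-wise XOR of all data disks. The three tiny "disks" here are made-up byte strings, not data from this array.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical contents of three data disks (illustration only).
disk1 = b"\x10\x20\x30"
disk2 = b"\x0a\x0b\x0c"
disk3 = b"\xff\x00\x55"

# Single parity is the XOR of every data disk.
parity = xor_blocks([disk1, disk2, disk3])

# One disk missing: XOR of the survivors and parity gives it back exactly.
rebuilt_disk2 = xor_blocks([disk1, disk3, parity])

# Two disks missing: only their XOR is recoverable, not either disk on its own.
disk2_xor_disk3 = xor_blocks([disk1, parity])

if __name__ == "__main__":
    print(rebuilt_disk2 == disk2)
    print(disk2_xor_disk3 == xor_blocks([disk2, disk3]))
```

With two disks unreadable at once, parity only pins down the combination of the two, which is why content from both appears to vanish until one of them mounts again.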

I am now back to only about 2.5 TB of data missing. Perhaps that is what has now gone into the lost+found folder?

Thanks. Rebuilding now.

I guess I will see if I can find all those Linux distros again.

In any case, I should be able to see within the next day or so whether the PSU really was the culprit. I guess it would explain a lot of the random issues I have been having over the last year or so.

