
Disk and parity issues after upgrading license



I took advantage of the sale and upgraded my license from Basic to Pro today. Part of the upgrade process required me to stop and restart the array. After stopping the array, my server became more or less unresponsive via the WebUI, so after a while I opted to reboot it. It took a while to come back up, and I immediately started hearing a high-pitched noise from one of my drives. Read errors then appeared on one of the disks in the array, followed by parity sync errors. My parity check normally takes about 3 hours to complete, but it's now at 25% after six and a half hours. Fortunately, the affected disk only holds my network backups and no other data!

 

I have no idea how to diagnose disk issues or what exactly could have gone wrong, and while the parity check is running I can't run a SMART test on the drive.
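Once the parity check has finished or been cancelled, the drive's SMART attributes can be read from the console. Below is a minimal sketch, assuming `smartctl` (from smartmontools) is available and that the disk is `/dev/sdX`, which is a placeholder and not taken from the diagnostics. The helper name `smart_key_attrs` is made up for illustration. It keeps only a few attributes that are commonly looked at when trying to tell a failing drive from a bad cable.

```shell
# Hypothetical helper: reads `smartctl -A` output on stdin and prints
# "attribute_name raw_value" for a handful of attributes.
# Assumes the standard 10-column attribute table (name in column 2,
# raw value in column 10).
smart_key_attrs() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable|UDMA_CRC_Error_Count)$/ { print $2, $10 }'
}

# Usage (assumption: replace /dev/sdX with the real device):
#   smartctl -A /dev/sdX | smart_key_attrs
#   smartctl -t short /dev/sdX    # start a short self-test
```

A rising `UDMA_CRC_Error_Count` generally points at the link (cable or connector) rather than the platters, while reallocated or pending sectors point at the drive itself.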

 

What am I looking at here? Replace the disk? Something that can be repaired? I'm wanting to get this sorted out so I can safely update to the latest stable release (from 6.9.2).

 

Any advice greatly appreciated.

 

TL;DR: Could someone check whether there's anything badly wrong in the diagnostics?

tower-diagnostics-20220618-2210.zip

Link to comment

So after reseating things and checking my server, I've booted up as normal and Disk3, the same disk that was giving the issues, shows up as "Unmountable". After a short amount of research I have found people talking about this:

 

xfs_repair -v /dev/md3

 

and:

 

I want to make sure I'm on the right track. I don't actually care about the data on this particular drive, so I'm happy to format it, but I'd like to try to get to the bottom of it all and test whether the drive is failing or whether it was just a coincidental error.
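A read-only pass is a cautious way to start before letting the repair change anything on the disk. Below is a sketch, assuming the array is started in maintenance mode and that Disk3 maps to `/dev/md3` as in the command above. `-n` is xfs_repair's no-modify mode; the wrapper name `xfs_check_only` is made up for illustration.

```shell
# Hypothetical wrapper: inspect an XFS filesystem without changing it.
# -n = no-modify mode (report problems only), -v = verbose.
xfs_check_only() {
    dev="$1"
    if [ -z "$dev" ]; then
        echo "usage: xfs_check_only <device>" >&2
        return 2
    fi
    xfs_repair -n -v "$dev"
}

# Usage (assumption: array in maintenance mode, Disk3 is /dev/md3):
#   xfs_check_only /dev/md3
# If the report looks sane, the actual repair is the command quoted above:
#   xfs_repair -v /dev/md3
```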

Link to comment

I'm guessing it's this bit?

 

"Jun 20 04:40:14 Tower kernel: ata5.00: failed command: READ FPDMA QUEUED
Jun 20 04:40:14 Tower kernel: ata5.00: cmd 60/40:f0:80:5d:45/05:00:39:00:00/40 tag 30 ncq dma 688128 in
Jun 20 04:40:14 Tower kernel:         res 40/00:10:50:6b:45/00:00:39:00:00/40 Emask 0x50 (ATA bus error)
"

 

Will follow your advice re: replacing cables and follow up.
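Lines like the ones quoted above can be pulled out of the syslog in bulk to see whether a single port is responsible for all of them. Below is a rough sketch, assuming the syslog is a plain text file (the path in the usage comment is a placeholder) and that error lines follow the `ataN.NN: failed command:` pattern quoted above. The function name `ata_failed_by_port` is made up.

```shell
# Hypothetical helper: count "failed command" kernel messages per ATA device.
# Reads syslog text on stdin and prints "<count> <ata device>" lines.
ata_failed_by_port() {
    sed -n 's/.*kernel: \(ata[0-9]*\.[0-9]*\): failed command:.*/\1/p' \
        | sort | uniq -c | awk '{ print $1, $2 }'
}

# Usage (assumption: path to the syslog extracted from the diagnostics zip):
#   ata_failed_by_port < /path/to/syslog.txt
```

If every hit lands on the same device and the message mentions an ATA bus error, the cable or connector on that port is a reasonable first suspect.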

Link to comment
2 hours ago, JorgeB said:

Yep

 

1 hour ago, trurl said:

and you can see which disk that is associated earlier in syslog1.txt

Big thanks to you both, learning experience for sure!
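For anyone following along, the lookup trurl describes can be done with a fixed-string search. Below is a small sketch, assuming the captured syslog still contains the kernel's usual boot-time identification line for the device, which has the form `ataN.NN: ATA-<version>: <model>, <firmware>, max UDMA/...`. The function name and the file path are placeholders.

```shell
# Hypothetical helper: show the identification line(s) the kernel logged
# for one ATA device, e.g. "ata5.00". Reads syslog text on stdin.
# grep -F treats the pattern as a fixed string, so the dot is literal.
ata_port_model() {
    port="$1"
    grep -F "kernel: ${port}: ATA-"
}

# Usage (assumption: path to the syslog extracted from the diagnostics zip):
#   ata_port_model ata5.00 < /path/to/syslog1.txt
```

The model and firmware strings in the matching line can then be compared with the disks listed on the Main page to find the affected drive.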

 

I've attached new diagnostics after replacing the cable. The power cable looked fine to me and is powering another drive on the same strand, so I've only replaced the data cable. The disk now shows as "Unmountable: not mounted".

tower-diagnostics-20220620-1804.zip

Link to comment
1 hour ago, trurl said:

Unrelated, but your appdata, domains, system shares are on the array. Docker/VM performance will be impacted by slower parity, and array disks can't spin down since these files are always open

Thanks. I prioritised my VM over the other stuff as I only had a small SSD, and my other (NVMe) SSD keeps unmounting, so for stability I had to move it all onto the array. I should be able to get a new SSD for this exact purpose soon, as I've also just upgraded to the Pro license!

Link to comment
