Nodiaque Posted June 27 Hello, 2 days ago, I woke up to an emulated disk. I stopped the array, mounted it in maintenance mode and ran a disk test; everything was fine. OK, so I rebuilt the entire array. I also swapped the data cable (connected to an HBA card) and shut down my server to do some inspection (and dusting). This morning, same thing, but I also have a warning saying that 2 disks on the array have read errors. I'm unsure what is going on. Disk 5 - ST16000NM001G-2KK103_ZL2NQ6RK (sdi) (errors 313227) Disk 6 - ST16000NM001G-2KK103_ZL2NR6GL (sdg) (errors 1882) I'm unsure what to do now. Do I have failing disks? Thank you servraid-diagnostics-20240627-0829.zip
JorgeB Posted June 27 It's not reported as a disk problem for either disk, and SMART looks OK. That, and the fact that the issue started on both disks at the same time, suggests a power/connection issue. Do the disks share something other than the miniSAS cable, like a power splitter?
Nodiaque Posted June 27 Author (edited) They are in a StarTech 4-disk enclosure that has 2 power connectors for the 4 disks and a fan. I don't have enough space in the server for my drives, so I was looking for something external that could safely hold them and came up with that. The disk that disconnected again has a new cable that comes from the other 4-wire channel of the HBA card. https://www.amazon.ca/gp/product/B00OUSU8MI/ref=ppx_yo_dt_b_search_asin_title?ie=UTF8&psc=1 My card is an LSI 9201 https://www.amazon.ca/gp/product/B0BVTJPZSG/ref=ppx_yo_dt_b_search_asin_title?ie=UTF8&psc=1 I ran a SMART short test on both disks, which passed, and I'm now running the long test to check. Edit: I'm wondering if I'm running out of power from the PSU. I have a Dell Precision 5820. This thing has limited power and I think I might be running out. It does go through some splitters because the PSU only has about 4 power headers, which is so stupid considering that thing has 12 SATA ports. Hardware: Xeon Gold W-2275 with 128GB ECC RAM; NVidia RTX 4000; 2x NVMe 1.0 TB SSD; 6x 16TB SATA HDD; 1x 8TB SATA HDD; 1x 4TB SATA HDD. I added some fans (and one was required by Dell for the RTX 4000 installation). Maybe I should try to find another HDD enclosure, but one with external power that I can connect like this one; I'm unsure if that exists. Edited June 27 by Nodiaque
JorgeB Posted June 27 Could be a power issue. You can also try swapping both disks with two other ones, and then see if the issue follows them or stays with the same slots in the enclosure.
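The swap test suggested above boils down to a simple comparison, sketched here in Python. This is only an illustration: the function name, the slot labels and the idea of recording a slot-to-serial map by hand are assumptions, not something Unraid provides.

```python
def classify_swap_test(before, after, faulty_before, faulty_after):
    """Decide whether errors followed the disks or stayed with the slots.

    before / after: dict mapping a slot label (hypothetical, e.g. "bay1")
        to the serial of the disk in that slot before and after the swap.
    faulty_before / faulty_after: sets of disk serials that logged errors
        in each configuration.
    """
    # Invert the maps so each faulty serial can be traced back to its slot.
    slot_of_before = {serial: slot for slot, serial in before.items()}
    slot_of_after = {serial: slot for slot, serial in after.items()}
    bad_slots_before = {slot_of_before[s] for s in faulty_before}
    bad_slots_after = {slot_of_after[s] for s in faulty_after}

    if not faulty_after:
        return "inconclusive: no errors after the swap"
    if faulty_after == faulty_before:
        return "errors followed the disks"
    if bad_slots_after == bad_slots_before:
        return "errors stayed with the slots (enclosure, cable or power)"
    return "inconclusive: mixed result"
```

If the same serials keep erroring in their new slots, the disks themselves are suspect; if the errors stay in the same bays with different disks, the enclosure, cabling or power feeding those bays is the more likely cause.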
Nodiaque Posted June 27 Author Ah yeah, good idea. But since I now have a disconnected drive, it means rebuild time again, so I won't do this before the array is green again. It would be bad to have 2 emulated drives in a single-parity config.
Nodiaque Posted June 27 Author I'm waiting for the rebuild, but the same disk 5 has already produced the same number of errors again during the rebuild. I'm waiting to see if it gets worse; it might also be a failing controller. After the rebuild, I'll try putting one of the drives outside the enclosure on another power cable (hoping it doesn't split off the same cable upstream).
Nodiaque Posted June 27 Author I think I have a controller problem... Disk 6 is being rebuilt. Disk 5 produced the read errors. Downloads is another disk that never had any errors... All are on the same external HBA card, on 2 different sets of cables.
Nodiaque Posted July 1 Author (edited) OK, so here's the "new" situation. 2 days ago, I took disk 5 and disk 6 out of the StarTech 4-bay unit, plugged them directly into a power cable and a SAS-to-SATA cable, and rebuilt. This morning, it's disk 5 that is now disconnected, and the download disk has an invalid path (like last time). The download disk is still in the StarTech bay, but I'm more concerned about the disconnected disk. It usually happens during the night at backup time (well, I think; all I know is that all backups fail during that time). I think this time it was during a parity check, because the parity check stopped at 33%. I'm wondering if it's the HBA card that's failing, the power that's not enough or something else. servraid-diagnostics-20240701-0836.zip Edited July 1 by Nodiaque
JorgeB Posted July 1 34 minutes ago, Nodiaque said: I'm wondering if it's the HBA card that's failing, the power that's not enough or something else. Could be; it's still not logged as a disk problem. Also see if it's happening right after a spin up. Some Seagate disks have issues with spin up when used on an LSI, especially SAS2 models, so if it's after a spin up, try disabling spin down to test.
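One way to check the "right after a spin up" pattern is to compare timestamps, as in this minimal Python sketch. It assumes you have already pulled the spin-up times and the error times for the disk out of the syslog by hand; the function name and the 30-second window are arbitrary choices for illustration, and no particular log format is assumed.

```python
from datetime import datetime, timedelta


def errors_after_spin_up(spin_ups, errors, window_seconds=30):
    """Return the error timestamps that fall shortly after any spin-up.

    spin_ups, errors: lists of datetime objects collected by hand.
    window_seconds: how long after a spin-up an error still counts as
        related to it (an arbitrary default, adjust as needed).
    """
    window = timedelta(seconds=window_seconds)
    return [
        err
        for err in errors
        if any(timedelta(0) <= err - spin <= window for spin in spin_ups)
    ]
```

If most of the errors come back from this function, the spin-up theory looks plausible and disabling spin down is a reasonable test; if almost none do, power or the controller stay the more likely suspects.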
Nodiaque Posted July 1 Author (edited) It would be weird for that to take all this time to show up. Is there another card I could use to avoid this problem, since all my disks are Seagate? Right now, I'm connecting a 2nd PSU to power the external drives to see if it's a power issue. I'll also set disk 5, the one that disconnected this time, to never spin down. I thought maybe the mainboard was having issues (since the HDD controller on the mainboard is fried), but all the disks on the HBA card would go down if it were shorting it. Edit: there's also the invalid path that I don't get. Now it's the download drive; last time it was the download drive and another drive. The download drive is still in the StarTech with another drive, though. Very weird. Edited July 1 by Nodiaque
JorgeB Posted July 1 The download path issue may be filesystem related, but your syslog is full of spam, making it very difficult to analyse.
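A spammy syslog can be thinned out before reading it, roughly as in the Python sketch below. Both helpers are hypothetical, the marker strings are whatever you choose to pass in, and the only format assumption is that the repeating part of a line is the text after the first ": ".

```python
from collections import Counter


def top_repeats(lines, n=5):
    """Return the n most frequent messages, to identify the spam.

    Assumption: the message is whatever follows the first ': ' on a line,
    so timestamps and process names do not make identical messages differ.
    """
    counts = Counter(
        line.split(": ", 1)[-1].strip() for line in lines if line.strip()
    )
    return counts.most_common(n)


def drop_spam(lines, spam_markers):
    """Keep only the lines that contain none of the given marker strings."""
    return [
        line for line in lines
        if not any(marker in line for marker in spam_markers)
    ]
```

Running `top_repeats` on the lines of the syslog first shows which messages dominate; passing a distinctive substring of those messages to `drop_spam` then leaves a shorter log in which disk and controller errors are easier to spot.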
Nodiaque Posted July 1 Author Yeah, the NVIDIA stuff is really bugging me. In the GUI, it says the installed version is the one the kernel wants. Since my array is unstable, I've yet to upgrade to the latest Unraid version.