dp12776 Posted April 27, 2023

Hi Community,

My server had a lot of old, smaller drives and a few newer, bigger drives. I started getting a problem where a disk (not always the same one) would get exactly 2048 errors and then be disabled. An extended SMART test showed nothing wrong with the disk, and it could be rebuilt as normal. The fault appeared, I think, while the mover was running or during a parity check.

I decided that either my HBA card or my PSU was at fault (spinning up too many drives at once) and removed all the old small drives using the non-parity-protected method. I also replaced my HBA card, in case it was faulty. Parity finished rebuilding over the last couple of days and everything looked fine until 07:00 this morning, when the mover started running. One of the new disks, which hasn't had a fault before, posted 2048 errors again and was disabled.

Why the exact same number of errors each time? Why no problem during parity sync or normal operation, only when the mover starts working? Am I looking at a bad SATA cable?

Please see the attached logs. Any help appreciated. This is making me nervous.

Daniel

alameda-diagnostics-20230427-0747.zip
JorgeB Posted April 27, 2023

Is it always one of the Seagate disks connected to the LSI HBA?
dp12776 Posted April 27, 2023

8 hours ago, JorgeB said: Is it always one of the Seagate disks connected to the LSI HBA?

Hi Jorge,

I am not really sure. It has happened 3-4 times. The last two times were on the 12TB Seagate drive, before I removed all the old drives. I don't recall which drive it was before that (if it even happened; I should have kept a better eye on what was going on). In any case, it is very likely that it has been a drive connected to the LSI HBA board each time. It certainly was this time.

This morning I swapped the affected 14TB drive to a different cable from the HBA board and have now started the rebuild. I really don't think it is a coincidence that it is always exactly 2048 errors.
Solution

JorgeB Posted April 27, 2023

I would recommend connecting those 12TB Seagate disks to the onboard SATA ports, swapping them with disks of other models if needed. There have been some issues with LSI SAS2 HBAs and high-capacity Seagate disks.
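To check which disks currently sit behind the HBA and which are on the onboard SATA controller, something like the following could help. It is only a minimal sketch, assuming a Linux system where `/sys/block/sdX` resolves to a path that contains the PCI address of the controller. The helper names (`controller_of`, `group_by_controller`, `read_sysfs_paths`) are made up for illustration and are not part of Unraid. The printed PCI addresses can then be compared against `lspci` output to see which one is the LSI card.

```python
import os
import re
from collections import defaultdict

# Matches a PCI address such as 0000:01:00.0 inside a sysfs path.
PCI_ADDR = re.compile(r"[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]")


def controller_of(sysfs_path):
    """Return the last PCI address in the path (assumed to be the disk's controller)."""
    matches = PCI_ADDR.findall(sysfs_path)
    return matches[-1] if matches else None


def group_by_controller(paths):
    """Group disk names by controller PCI address.

    `paths` maps a disk name (e.g. "sda") to its resolved sysfs path.
    """
    groups = defaultdict(list)
    for disk, path in paths.items():
        groups[controller_of(path)].append(disk)
    return {ctrl: sorted(disks) for ctrl, disks in groups.items()}


def read_sysfs_paths(root="/sys/block"):
    """Resolve the sysfs symlink of every sdX block device under `root`."""
    paths = {}
    for name in os.listdir(root):
        if name.startswith("sd"):
            paths[name] = os.path.realpath(os.path.join(root, name))
    return paths


if __name__ == "__main__":
    # Prints one line per controller, followed by the disks attached to it.
    for ctrl, disks in group_by_controller(read_sysfs_paths()).items():
        print(ctrl, ", ".join(disks))
```

The grouping is kept separate from the sysfs lookup so it can be tried on example paths without access to the server.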
dp12776 Posted April 27, 2023

Hi Jorge,

That is good to know. I have three of the Seagate disks, and I could easily make sure they are on the onboard SATA ports instead of the HBA. I will try to reorganize when the rebuild is done in a day or so.

Guess I won't be buying any more Seagate disks.