August 19, 2018

Hello All,

I've been trying to troubleshoot an issue that started after I switched to a 1U HP server: random HDDs start reporting errors. If I leave it long enough, a drive will go disabled because it can't be written to. However, if I restart the server as soon as I start to see errors, they clear and I am good for a week or two.

I was using an internal LSI SAS controller with two SAS-to-SATA cables for the drives. I've since switched to an external LSI controller and new external SAS-to-SATA cables and still have the same issue. I thought the external HDD case was the problem, so I took it out of the picture and connected a separate PC power supply (with a jumper) directly to the drives, but that doesn't help. (Note: my friend is doing this with zero issues.)

I read that memory might be an issue, but since the server has dual CPUs I don't know if I can remove sticks (I have a lot of them). Thoughts? Can memory trigger these types of issues?

Before people start suggesting that the external drive setup is the cause: I have four internal bays with direct onboard SATA cables, and these have issues too. Which drive starts reporting errors seems to be completely random. If you search my name, you will see a few other posts about issues rebuilding what I thought were bad drives, but I now know that replacing drives with newer ones is not fixing anything either.

When I get the email error alert, sometimes I see one drive with a single error, and sometimes when I check, 5 of 7 drives have errors. Again, a reboot fixes it for some time.

I have attached a new diag report taken after a restart, but I can start grabbing them when I see errors again, before I reboot. Some of my old posts about errors also have diag reports, and I assume they all show this same issue rather than the bad drives I suspected at the time.

server-diagnostics-20180819-0613.zip

Edited August 19, 2018 by rcmpayne
August 19, 2018 Community Expert

If I understand correctly, you have already replaced the HBA and are using a different PSU for the drives, so my next suspect would be the motherboard. I would consider it very unlikely that the CPU or RAM is causing these errors.
August 19, 2018 Author

9 hours ago, johnnie.black said: If I understand correctly you already replaced the HBA and are using a different PSU with the drives, my next suspect would be the motherboard, I would consider very unlikely CPU and RAM causing these.

I have a Dell R415 1U I can try. Can I just move the LSI controller, the USB boot drive and the HDDs over and boot in the new system?
September 1, 2018 Community Expert

On 8/20/2018 at 12:22 AM, rcmpayne said: Can I just swap the lsi controller, usb boot drive + the hdds and boot in the new system?

It should work.
September 1, 2018 Author

OK, I moved everything to the Dell server and started getting the errors again. A restart fixed it. Last night one drive went disabled. I'm trying to fix that now, but none of the XFS switches I have tried set it back to a green icon. It's disk7, which is a two-month-old WD Red.

server-diagnostics-20180901-0325.zip
September 1, 2018 Community Expert

The diags were taken after rebooting, so they are not much help. But if you have already replaced the HBA and PSU, and now the same thing happens with a different server, it's very odd. SMART for the disabled disk looks fine.
September 1, 2018 Author

Yeah, I am at a loss. New server, new LSI controller, new cables, new PSU, new RAM, new HDDs. So how can I get this disabled drive back? It's brand new.
September 1, 2018 Community Expert

You can rebuild, or do a new config (if no new data was written to that disk since it got emulated). I would recommend a rebuild only if you use a new disk: since the server is unstable, any issues during the rebuild might leave the array worse off than it is now.
September 1, 2018 Author

If I do a new config, will it rebuild the missing data from that drive?
September 2, 2018 Community Expert

12 hours ago, rcmpayne said: If I do a new config, will it rebuild the missing data from that drive?

No. It would re-enable the disabled disk as it was before it got disabled, and any data saved to the emulated disk in the meantime, if there was any, would be lost.
September 3, 2018 Author

OK, I backed up the drive to a spare, then removed it and re-added it to the array. After it rebuilt, everything was back and matched the backup I took.
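For anyone wanting to repeat the "matched the backup" check after a rebuild, here is a minimal sketch of one way to do it: hash every file on both sides and compare. This is not how the check was done in this thread; the function names `hash_tree` and `compare_trees` are hypothetical, and the example paths in the final comment are assumptions about where the rebuilt disk and the spare might be mounted.

```python
import hashlib
from pathlib import Path


def hash_tree(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            h = hashlib.sha256()
            with path.open("rb") as fh:
                # Read in 1 MiB chunks so large media files don't fill RAM.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    h.update(chunk)
            digests[path.relative_to(root).as_posix()] = h.hexdigest()
    return digests


def compare_trees(rebuilt_root, backup_root):
    """Return (missing, extra, changed) relative paths, each sorted.

    missing: in the backup but not on the rebuilt disk
    extra:   on the rebuilt disk but not in the backup
    changed: present on both sides with different contents
    """
    rebuilt = hash_tree(rebuilt_root)
    backup = hash_tree(backup_root)
    missing = sorted(set(backup) - set(rebuilt))
    extra = sorted(set(rebuilt) - set(backup))
    changed = sorted(
        p for p in set(rebuilt) & set(backup) if rebuilt[p] != backup[p]
    )
    return missing, extra, changed


# Hypothetical usage (both mount points are assumptions, adjust to your setup):
# missing, extra, changed = compare_trees("/mnt/disk7", "/mnt/disks/spare")
```

Three empty lists mean the rebuilt disk and the backup hold the same files with the same contents. On a server that is still throwing intermittent errors, it may be worth running the comparison more than once, since a read error during hashing would itself show up as a mismatch.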