June 25, 2023: This only seems to have been an issue since I updated to 6.12.x, but I have a feeling that is just a coincidence and my HBA is struggling with the heat.
After a few issues since updating to 6.12.0 and then to 6.12.1, I thought I had them ironed out, but apparently not. In order to update to 6.12.1 I had to remove my USB stick, because the GUI was not available on 6.12.0. Following the update I started to get CRC errors across all my drives. I assumed this was a cable connection issue, probably caused by the vibration of me taking out the USB stick (the server was in an awkward spot and had to be moved slightly to get at it). I have since moved my server to a more accessible location and reconnected all my SAS cables. This seemed to solve the issue: no errors at all.
Due to other issues I started a parity check, and now, 4 hours in, I'm starting to get the CRC errors again. They are spread across several drives. The counts are small on most, and none is above 1000 that I noticed.
I have a 9300-16i HBA and I'm concerned it might be overheating and causing the issue. I do have a 3000rpm fan blowing straight down on the HBA and the other cards beside it. As a long-term solution, if this is the case, would a newer HBA like a 9400 or 9500 run at lower temps?
I have included diagnostics taken while the parity check was still running. I'm going to pause the parity check for now and schedule it to run for shorter periods overnight to try to mitigate the issue.
server-diagnostics-20230625-1314.zip
June 25, 2023: When was the last time you opened up the case and thoroughly cleaned the dust and dirt out of it? Pay attention to the various heatsinks and make sure that you get them clean. Make sure that the fans are set up so that air enters the case only from the front, passes over the disk drives first, and is pulled out the back by the fans located there.
June 25, 2023 (Author): I built the server in February, and I cleaned it out yesterday and reseated all the cables while I was at it. The server is clean as a whistle. The 9300 is an eBay buy, so it's possible the heatsink could do with a repaste, but I'd rather not get into doing that. It's just strange timing that this happens with the update.
June 25, 2023: How do you know the HBA is overheating? I have the same one, and I don't think it has a temperature sensor. If it's just CRC errors, it may just be cables. SATA cables and connectors are unfortunately among the worst when it comes to PC connectivity, and even a bit of unsettling from vibration or other activity can dislodge them or wear them out.
June 25, 2023 (Author): I just think it is... As I already said, I have reseated the cables and the errors went away under normal use. But when I started a parity check, after a couple of hours it started to spit out the errors again. And now that I have stopped the parity check, it's back to normal. It's just the only thing I can think of that's causing it at this point. The hardware had been running on 6.11.5 since February with no issues and went through a few parity checks. It has gotten a lot warmer since then, of course. So it's the only thing I can think of at this point.
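One way to confirm that the errors only grow under parity-check load is to note each drive's CRC error count before the check and again afterwards, then compare the two. A minimal Python sketch, assuming the counts have already been collected per drive; the function name `crc_deltas`, the device names, and the numbers are all made up for illustration and are not taken from the posted diagnostics:

```python
from typing import Dict


def crc_deltas(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Return the drives whose CRC error count rose between two snapshots."""
    deltas = {}
    for drive, count_after in after.items():
        # A drive missing from the first snapshot is treated as starting at 0.
        increase = count_after - before.get(drive, 0)
        if increase > 0:
            deltas[drive] = increase
    return deltas


# Hypothetical counts, purely illustrative.
before_check = {"sdb": 12, "sdc": 0, "sdd": 40}
after_check = {"sdb": 12, "sdc": 7, "sdd": 95}
print(crc_deltas(before_check, after_check))
```

If the drives that increase all hang off the same breakout cable, that points at the cable; if they are spread across every cable on the card, the HBA or its cooling becomes the more likely suspect.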
June 25, 2023: 3 hours ago, tazire said: "I have a 9300-16i HBA"
Did you purchase a refurbished/used card or a 'new' one? The reason for this question is that this card is no longer manufactured by LSI (or its successor companies). If you purchased a new 9300-16i, probably only the chipset is made by LSI. The remainder of the card is the result of reverse engineering of the original LSI card. (They have been known to copy these right down to the paper labels on the card!) These cards are often referred to as 'counterfeit'.
There are several problems with this situation. How many manufacturers are actually making these cards? Why won't they claim ownership of their production? That is not to say that some of them are not producing a quality product, but how does the purchaser know which ones are? There is also the problem of cost cutting. Unfortunately, one area for this is the heatsink. I have seen pictures of counterfeit cards, and the heatsinks are often smaller than on the original LSI-manufactured cards.
June 25, 2023 (Author): Yeah, I was aware of that issue when I was ordering. I checked the serial number when I got it and it was a legit SN. But again, that's assuming they didn't copy the SN of an original.
As I said though, it has worked absolutely perfectly since February. It has only been causing issues since I updated to 6.12.x. It was well stressed and went through multiple parity checks prior to the update without any issues whatsoever. It has been much hotter here recently, so I thought that maybe the update was just a coincidence, but tbh it's not that hot today (16 degrees), so I didn't expect it would be an issue. I don't know why it would all of a sudden start having issues when it was rock solid for so many months. I was just assuming it was overheating because so many drives were having the issue, and only when stress was being put on the HBA.
I was thinking of bumping up to a higher-end HBA so that it wouldn't be stressed as much by spinning-rust drives during parity checks, in the hope that it wouldn't get as hot, but I'd like to troubleshoot this one as best I can before I invest more money in it. Also, I have no experience with them, so I have no idea whether my theory about them running cooler is true.
EDIT: It was marked as used/refurbished, I believe. I bought it from a seller in Germany in the hope of avoiding the Chinese knock-offs. But you never know, I guess.
(Edited June 25, 2023 by tazire)
June 25, 2023: As you probably realize, CRC errors are 'soft' errors and they are always corrected before the data transfer goes any further. But you still should not be getting them with properly functioning hardware. Cables are usually the first suspects. (I would recommend that you get 0.5m cables rather than the more common 1m ones, for two reasons: (1) shorter cables are less susceptible to cross-talk, and (2) their shorter length means better airflow through the case.) The second suspect would be the card. There is a 9305-16i version of this card with a newer chipset. I believe it is still being produced by Broadcom (LSI's successor), but it is more expensive. If you are going to test by substitution, double-check the refund policy of your supplier!
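For anyone wanting to track these soft errors over time: on SATA drives the count is normally reported as SMART attribute 199 (`UDMA_CRC_Error_Count`) in the attribute table printed by `smartctl -A`. Below is a rough Python sketch that pulls the raw value out of that table. The function name `udma_crc_count` and the sample text are illustrative assumptions, not output from the diagnostics in this thread, and SAS drives do not report this attribute table at all.

```python
from typing import Optional


def udma_crc_count(smartctl_output: str) -> Optional[int]:
    """Extract the raw value of SMART attribute 199 from `smartctl -A` text.

    Returns None if the attribute is not present or its raw value is not
    a plain integer.
    """
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have 10 columns; RAW_VALUE is the last one.
        if len(fields) >= 10 and fields[0] == "199":
            try:
                return int(fields[9])
            except ValueError:
                return None
    return None


# Hypothetical sample in the usual smartctl attribute-table layout.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       4213
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       37
"""

print(udma_crc_count(SAMPLE))
```

The raw value is a lifetime counter that never resets, so what matters is whether it keeps climbing after a cable or card swap, not its absolute size.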
June 26, 2023 (Author): Yeah, I already have 0.5m cables, and airflow itself is fine from what I can gather. The 24 hard drives are sitting at 30-35 degrees. The 10 NVMe drives are sitting at 35-45 degrees. And just to be on the safe side, I put a 3000rpm fan over the HBAs in the system, blowing straight down on them. I think my only option at this point is to try a different HBA. Thanks for the help.
It's just the fact that it worked from February to June with zero issues until I updated the system. I know that is likely just coincidence, but part of me wonders whether some system change is taxing the HBA more as a result. The HBA should of course be able to handle it... I'm just curious as to why it was perfectly fine for so long. Anyway, thanks for your insight. Hopefully an HBA change will sort my issue. I won't get to do it for a while... for now at least, if I don't run a parity check I get no issues.
June 26, 2023: I would say it's a coincidence. I had a similar problem with a 9200-8e. After replacing it with a Dell H200e, everything works normally.
March 25, 2025: Sorry to necro this old thread, but I have been trying to find other possible reasons why my drives suddenly started having issues after working for a long time without any.
Various drives have suddenly been showing CRC errors, even though they all worked fine for quite some time. I cannot recall the upgrade path or the timing of my update to 6.12.3, but I know the CRC issues seem to have been a thing since then. Recently I added a hard disk to the front-loading Icy Dock cage I have, as the leftmost bay was empty. So perhaps that affected the airflow in the case more?
Either way, the last time I had an issue was a week or two ago, which thankfully Jorge was around to assist with. I moved the RAID card from a PCIe x8 slot to an x16 slot and moved the unused graphics card across. The issue was almost certainly the cable, or so I thought, as there were two ports on the breakout cable that would not connect any drive. So I figured it must be a bad cable, replaced it, and it was OK for a week. When I pulled out the HBA card it was close to being too hot to touch, and I imagine moving it to an x16 slot would only have made that worse.
However, today another couple of drives (different ones again) started to give me errors, so I replaced the drives and reseated all the cables again, then tried touching the HBA heatsink. It was again extremely hot to the touch. After rebooting, one drive showed an unsupported formatting error (or something like that), but for whatever reason everything came up and the data rebuild started. Because of the issues I have had, I decided to stop that and upgrade Unraid before doing it. So I have now updated to Unraid V7 before rebuilding, and I will bookmark this thread and report back.
As pzg said above, it could well just be a coincidence. However, my symptoms match the ones described here closely enough that I wanted to note that I experienced the same thing.
March 25, 2025 (Author), in reply to Driden:
For me, in the end, I just replaced the HBA and I haven't had any issues since. I would definitely suggest having a fan blowing air on the HBA if you feel it's overheating. They tend to run fairly hot, but they are also designed to expect server levels of airflow.
March 25, 2025: Yes, I'll be moving to a different case with much better airflow soon. I've been using an old 4U rackmount case for quite some time and I don't think it's up to the job for airflow.
March 25, 2025 (Author): 3 minutes ago, Driden said: "Yes. I'll be adding a different case with much better airflow soon."
Yeah, I also went with a newer HBA, a 9500. My logic was that it's overkill for the task and so maybe it won't get as hot! Probably absolute bull**** and I was just convincing myself to spend the money! Either way, I haven't had a problem. It was very much plug and play and hasn't missed a beat.
March 25, 2025 (Author): 1 minute ago, MowMdown said: "PSA for people with HBAs... Attach a Noctua 40mm fan directly to the heatsink."
100%. Or if, like me, you can't fit it directly, then have a 120mm or 140mm fan directly above it blowing straight at the HBA.
March 18: On 3/25/2025 at 7:24 PM, MowMdown said: "PSA for people with HBAs... Attach a Noctua 40mm fan directly to the heatsink."
Those HBAs can get surprisingly hot, especially in low-airflow cases. A small Noctua fan is quiet and makes a noticeable difference in temps and stability.
March 18: This type of HBA needs active airflow; otherwise it gets hotter than 60°C. (Edited March 18 by Zonediver)