
tazire
Members · 365 posts
-
After a recent update I started to get the following error... The info on the link doesn't really address the issue; it mentions ocs-provider rather than the ocm endpoint referenced in the error. There's very little info on Google about it. I use NPM (Nginx Proxy Manager) as my reverse proxy, so I'm not sure if that's the cause. Any help would be appreciated. Again, this was never an issue until one of the recent updates.
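A quick way to narrow down whether the reverse proxy is to blame is to hit the endpoint the error names directly. A minimal sketch, assuming the error is the usual /ocm-provider check, with cloud.example.com standing in for your own domain and 192.168.1.10 for the backend address (both hypothetical):

# Check the endpoint through the proxy (hypothetical hostname)
curl -sI https://cloud.example.com/ocm-provider/ | head -n 1

# Compare against the backend directly, bypassing NPM (hypothetical address),
# to see whether the proxy or the app itself is returning the error
curl -skI https://192.168.1.10/ocm-provider/ | head -n 1

If the direct request returns 200 but the proxied one does not, the fix is on the NPM side; if both fail, it's in the app's own config.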
-
Curious if anyone knows anything about the LSI SAS3x36R expander? I have been looking at a 2U 24-bay chassis that uses this expander, with the intention of using it as a JBOD to connect SSDs to my server. I'm just trying to make sure it will have the throughput for SSDs. Fairly sure it should be absolutely fine during normal use, but in a worst case, if I'm hitting all 24 drives at once, what's the theoretical throughput? EDIT: I would use a 9500-16e HBA to get 3 connections to the expander and a PCIe 4.0 link to the HBA.
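For a rough sanity check on the worst case, here's a back-of-the-envelope sketch. It assumes three x4 SAS3 wide links (12 Gb/s per lane) between the 9500-16e and the SAS3x36R and roughly 20% protocol/encoding overhead; treat the numbers as ballpark only:

# Raw link bandwidth from the HBA into the expander
echo "$((3 * 4 * 12)) Gb/s raw"      # 144 Gb/s

# Usable throughput after ~20% overhead, and the per-drive share
# if all 24 SSDs are being hit at once
awk 'BEGIN { printf "%.1f GB/s usable across the expander\n", 3*4*12*0.8/8 }'
awk 'BEGIN { printf "%.0f MB/s per drive with all 24 busy\n", 3*4*12*0.8/8/24*1000 }'

That works out to roughly 14 GB/s aggregate, or around 600 MB/s per drive, and a PCIe 4.0 x8 slot (about 16 GB/s) shouldn't be the bottleneck, assuming the card negotiates the full link.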
-
Weird behaviour on an NVMe drive. I previously had Proxmox installed on this drive, but I now want to clear it and use it as a cache drive on my test server. The drive appears twice in UD, both entries at 8.59GB (it's a 512GB drive). When I try to clear or format the drive it just fails. Is there anything I can do to force clear/format the drive to get it working as it should? I have a feeling this is just whatever filesystem Proxmox left on the drive causing the issue, so probably nothing to do with UD as such, but any help would be great. tower-diagnostics-20230822-2339.zip
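For anyone else hitting this, the approach I'd try from the command line is to strip every leftover signature (Proxmox typically leaves LVM and a partition table behind) before handing the drive back to UD. A rough sketch, assuming the disk is /dev/nvme0n1 (confirm with lsblk first); note this destroys everything on the drive:

# See what's still on the disk (partitions, LVM, old filesystems)
lsblk -o NAME,SIZE,FSTYPE,MOUNTPOINT /dev/nvme0n1

# If lsblk shows active LVM volumes from the old Proxmox install and the
# LVM tools are available, deactivate them first
vgchange -an

# Remove filesystem/LVM/partition-table signatures, partitions first, then the disk
wipefs -a /dev/nvme0n1p1        # repeat for any other partitions lsblk listed
wipefs -a /dev/nvme0n1

# Zap GPT/MBR structures and discard the whole device so it comes back clean
sgdisk --zap-all /dev/nvme0n1
blkdiscard /dev/nvme0n1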
-
I've gone with the 9500-16i again, though without the RAM. I haven't seen anything anywhere about people having to flash the firmware to IT mode, so I was assuming it comes in IT mode. Cheers for the quick response.
-
@Pri Just a question on the 9500... is IT mode a thing on the newer HBAs? I keep getting errors with my 9300-16i, so I've invested in a 9500-16i. I haven't seen it mentioned anywhere, and I'm guessing the newer HBAs just work out of the box?
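For what it's worth, a quick way to confirm the card is detected and see what firmware it came up with, without any vendor tools, is to check the PCI listing and the driver messages (the grep patterns below are only examples):

# Confirm the HBA shows up on the PCI bus
lspci -nn | grep -iE 'lsi|broadcom|sas'

# The mpt3sas driver logs the firmware version it found at boot
dmesg | grep -i mpt3sas | head -n 20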
-
Looking for a bit of help removing an NFS share using the command line. The server that was hosting a remote share shut down due to a power outage, but the share is still mounted in UD. When I clicked the unmount button in UD it locked up the GUI completely. I can still SSH into the server, so I'm just looking for the commands to force unmount the share, or ideally force remove it. I don't want to just try a reboot on the off chance it hangs the server completely; I'd like to avoid a hard reset if possible. EDIT: Found my answer: umount -l /pathofmount/ sorted the issue and the GUI was immediately accessible again.
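For anyone who lands here with the same problem, the general pattern (the mount path below is only an example; use whatever 'mount' shows for your share):

# Find the stale NFS mount point
mount | grep nfs

# Lazy unmount: detaches the share immediately and cleans up once nothing is using it
umount -l /mnt/remotes/SERVER_share

# If even that hangs, a forced unmount is the heavier option for a dead NFS server
umount -f /mnt/remotes/SERVER_share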
-
Yeah, I already have 0.5m cables, and airflow itself is fine from what I can gather. The 24 hard drives are sitting at 30-35 degrees and the 10 NVMe drives at 35-45 degrees, and just to be on the safe side I have a 3000rpm fan sitting over the HBAs blowing straight down on them. I think my only option at this point is to try a different HBA. Thanks for the help. It's just the fact that it worked from February to June with zero issues until I updated the system. I know that is likely just coincidence, but part of me wonders whether some system change is taxing the HBA more as a result. The HBA should of course be able to handle it... I'm just curious as to why it's been perfectly fine for so long. Anyway, thanks for your insight. Hopefully an HBA change will sort my issue. I won't get to do it for a while... for now, at least, if I don't run a parity check I get no issues.
-
Yeah, I was aware of that issue when I was ordering; I checked the serial number when I got it and it was a legit SN, but again, that's assuming they didn't copy the SN of an original. As I said though, it has worked absolutely perfectly since February and has only been causing issues since I updated to 6.12.x. It was well stressed and went through multiple parity checks prior to the update without any issues whatsoever. It has been much hotter here recently, and I thought the update might just be a coincidence, but tbh it's not that hot today (16 degrees), so I didn't expect it would be an issue. I don't know why it would suddenly start having issues when it was rock solid for so many months. I was just assuming it was overheating because so many drives were having the issue, and only when stress was put on the HBA.

I was thinking of bumping up to a higher-end HBA so that it wouldn't be stressed as much by spinning rust during parity checks, in the hope that it wouldn't get as hot, but I'd like to troubleshoot this one as best I can before I invest more money. Also, I have no experience with them, so I have no idea if my theory about them running cooler is true. EDIT: it was marked as used/refurbished, I believe. I bought it from a seller in Germany in the hope of avoiding the Chinese knock-offs, but you never know, I guess.
-
I just think it is... as I already said, I have re-seated the cables and the errors went away under normal use, but a couple of hours after I started a parity check it began to spit out the errors again, and now that I have stopped the parity check it's back to normal. It's just the only thing I can think of that's causing it at this point. The hardware was running on 6.11.5 since February with no issues and went through a few parity checks. It has gotten a lot warmer since then, of course, so heat is the only explanation I can come up with at this point.
-
I built the server in February and cleaned it out yesterday, re-seating all the cables while I was at it, so the server is clean as a whistle. The 9300 is an eBay buy, so it's possible the heatsink could do with a re-paste, but I'd rather not get into doing that. It's just strange timing that it happens with the update.
-
This only seems to be an issue since I updated to 6.12.x, but I have a feeling that's just a coincidence and my HBA is struggling with the heat. After a few issues since updating to 6.12.0 and then 6.12.1, I thought I had them ironed out, but apparently not.

In order to update to 6.12.1 I had to remove my USB drive, due to the GUI not being available on 6.12.0. Following the update I started getting CRC errors across all my drives. I assumed this was down to a cable connection issue, probably caused by the vibration of me taking out the USB (the server was in an awkward spot and had to be moved slightly to get at it). I have since moved the server to a more accessible location and reconnected all my SAS cables. This seemed to solve the issue: no errors at all. Due to other issues I started a parity check, and now, 4 hours in, I'm starting to get the CRC errors again. They are across several drives, in small amounts on most, and none above 1000 that I noticed.

I have a 9300-16i HBA and I'm concerned it might be overheating and causing the issue. I do have a 3000rpm fan firing straight down on the HBA and the other cards beside it. As a long-term solution, if this is the case, would a newer HBA like a 9400 or 9500 run at lower temps? I have included diagnostics while parity is still running. I'm going to pause the parity check for now and schedule it to run in shorter periods overnight to try to mitigate the issue. server-diagnostics-20230625-1314.zip
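In the meantime, a quick way to tell whether the CRC counters are still climbing during a check is to poll them with smartctl before and after a run. A rough sketch, assuming SATA drives behind the HBA; the device range is an example, so adjust it to your own drives (some controllers also need an extra '-d sat'):

# Print the UDMA CRC error count (SMART attribute 199) for each drive
for d in /dev/sd{b..y}; do
  [ -e "$d" ] || continue
  echo -n "$d: "
  smartctl -A "$d" | awk '/UDMA_CRC_Error_Count/ {found=1; print $NF} END {if (!found) print "n/a"}'
done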
-
Thanks @JorgeB, I'll do this when I re-add them. Thanks again.
-
I've added 3x 4TB WD Reds. They cleared just fine, but now when I go to format them I get the error "Unmountable: Unsupported or no file system", and my logs show an issue with an invalid superblock number. At present the array is up and running just fine, with no issues other than the 3 drives sitting there in that status. These 3 drives were part of the array a long time ago and were swapped out for bigger drives due to a lack of drive bays, but since I added a JBOD I wanted to try adding them back. Since they get fully wiped during the clear, I doubt that's the issue? Is this likely a sign of a bigger problem? Diagnostics included; any help would be greatly appreciated. Also, these are the first drives I've tried to add since the update to 6.12.x. I did have some difficulty following the update, which appeared to be resolved before I tried this. I have since been able to remove the drives from the array and it's back running as normal. I'm running a parity check on it at the moment just to make sure the parity wasn't affected by this. server-diagnostics-20230624-2103.zip
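For reference, before re-adding them next time I'll double-check that the clear really did leave nothing behind from their previous life in the array. A quick sketch, with /dev/sdX standing in for each of the three drives (check the actual letters under Main or with lsblk):

# Look for leftover filesystem or partition signatures on the old drives
lsblk -o NAME,SIZE,FSTYPE,LABEL /dev/sdX
blkid -p /dev/sdX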