tazire

Everything posted by tazire

  1. One thing I didn't see mentioned, which is a big plus for me about the array setup, is that if you lose a drive beyond your parity drives you only lose the data on that one drive. You can just replace the drive and restore the data lost on only that drive (assuming it's not irreplaceable). All the data on the other drives remains and is still usable. With ZFS you will obviously lose everything should you lose that 3rd drive... as unlikely as that may be. In my case I never need the raw read speed of a proper ZFS pool (as opposed to ZFS in the array) for my media, and as long as there aren't several streams coming from the same drive there is plenty of read speed available. Theoretically, if I get multiple streams to different drives, the cumulative array output can actually be very high (30 drives working at a max output of 150-200MB/s, say a conservative 100MB/s each). Write speed is obviously limited to the speed of a single drive due to parity, but this can be overcome with cache drives, and I don't put anything on my array that needs that raw write speed either. In the end I made multiple pools for data that I decided needed the speed and general benefits of ZFS, and used the array for the easily replaceable media. I'd say I'm in a very different position to most in terms of the hardware I'm running, so these considerations might not be an option for everyone, but it's definitely something to consider.
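     A rough back-of-the-envelope for the aggregate read figure above, assuming a conservative 100MB/s per drive and every stream hitting a different drive (illustrative numbers, not measurements):
       # 30 data drives, ~100MB/s sustained each, streams spread across separate drives
       echo "$((30 * 100)) MB/s aggregate reads"   # vs roughly single-drive speed for parity-limited writes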
  2. Will do. Any idea why the MAC address isn't populating on the network settings? I assume this is a completely different issue and not related?
  3. Just looking for a bit of help with an FCP alert in relation to macvlan/bridging being active. I made this post on general support about it. Essentially, no matter what settings I have active it gives me the alert about macvlan/bridging being active. I have tried macvlan with bridging active, which obviously gives the alert; then macvlan with bridging disabled, which again gave the alert; then macvlan with bridging disabled and making sure no containers have bridge set; and finally ipvlan, which still gives me the alert about macvlan and bridging being active. I made sure to reboot after each settings change. Everything seems to be working just fine, with the exception of the MAC address not populating on the network settings page. A pic of this is available on the above link along with diagnostics. Any help would be appreciated.
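     Not a fix, but a quick way to double-check from the console whether any macvlan interfaces or Docker macvlan networks actually exist after a settings change (assuming the standard Unraid shell and Docker CLI):
       ip -d link show | grep -i macvlan   # lists any interfaces created with the macvlan driver
       docker network ls                   # shows whether a macvlan-type Docker network is still defined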
  4. Yeah I did. I just rebooted again now to double check and it's still showing the error.
  5. I've been doing some housekeeping with my server recently. I didn't want to get into it while it was working well, tbh. I've been using macvlan/bridging for as long as I can remember. I never had the crashing issue people reported, but I did have some strange issues with the network dropping out if I started to download through a VPN... it might be a completely separate issue, but I just kind of assumed it was related. It never bothered me as it would return after 5 or 10 minutes and everything would work as normal, so I had the FCP alert ignored. In the last few days I started to try to address this alert, following the recommended settings from the docs. Firstly, as I have Ubiquiti networking gear, I tried to disable bridging and keep macvlan. FCP continued to give me the alert saying that macvlan/bridging was active. I then went through each container to make sure none were set to bridge, assuming that was causing it... still the alert from FCP. I then changed to ipvlan and turned bridging back on, and still FCP persists in giving the alert. I will say that all containers have run perfectly in each setting. On a side note, I noticed in the network settings that the MAC address doesn't populate. Again, this might have nothing to do with my issue. I'm on 6.12.8, but this issue has persisted since 6.12.6. Any help would be appreciated. server-diagnostics-20240216-1628.zip
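     For the blank MAC field, a quick console check, assuming eth0 is the parent interface (swap in br0/bond0 as appropriate for your setup):
       ip link show eth0   # the link/ether line should show the MAC in use
       ethtool -P eth0     # prints the permanent hardware address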
  6. Ah ok... Yeah, it seems that Kasm is the issue... Thankfully it's not vital for my needs; it was just for playing with and for being able to access blocked stuff at work. Thanks for your help m8.
  7. How do I figure that out? Sorry, I've never had to do it based off this information!
  8. @Michael_P No, I've no Firefox container. Unless another container has it included. The only recent addition I've made was linuxserver's Kasm, but I don't leave any workspaces active. I have it set to delete a workspace after a certain amount of inactivity, and as a rule I manually delete workspaces after I'm finished anyway.
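     For anyone else trying to pin this down, a rough sketch of how I could search every running container for a Firefox process (docker top output varies by image, so treat this as a starting point only):
       for c in $(docker ps --format '{{.Names}}'); do
         docker top "$c" 2>/dev/null | grep -qi firefox && echo "$c"
       done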
  9. Just need a bit of help identifying what could be causing this. I have attached my diagnostics. This is the 2nd time it's happened; it doesn't happen every night, but it's the 2nd time over the last 6 months or so. The timing of the alert is usually between 4am and 5am, which is when the appdata backup is running. This may or may not have anything to do with the issue. Usually by the time I wake up it has resolved itself. Just curious if anyone can help. Cheers. EDIT: I suspect the issue might be Duplicacy. I think I changed the run time of the appdata backup to be slightly later, which is possibly causing an overlap in those backups, i.e. the appdata backup is still running when Duplicacy starts its backup to B2. I'll change the times to make sure this overlap is gone to test it. But if anyone can spot the issue from the logs, that would be great. server-diagnostics-20240209-0904.zip
  10. Very nice updates. I'm using that cooler and 2 of the Asus Hyper cards as well. I've been half looking at an A4000 myself. Currently using a P4000, but I'm also tempted to hang on a while and get an Ada-generation A4000 for AV1 support when the prices drop a bit. The P4000 is plenty for the time being anyway. I added a Chenbro 24-bay 2U chassis for SSDs, moved everything to my spare room and put it in a Lack rack! Eventually I'll move it out to the shed, but I need to insulate and run power and networking before I can do that.
  11. After a recent update I started to get the following error... The info on the link doesn't really address the issue; it talks about ocs-provider rather than the ocm-provider mentioned in the error. There's very little info on Google about it. I use NPM as my reverse proxy, not sure if that's the issue. Any help would be appreciated. Again, this was never an issue until one of the recent updates.
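     As a rough sanity check of whether NPM is even passing that path through (the domain below is a placeholder, not my actual setup), a request to the endpoint named in the warning should come back 200 rather than 404 once things are right:
       curl -sI https://cloud.example.com/ocm-provider/   # inspect the HTTP status the proxy returns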
  12. The particular expander allows for up to 3 connections. Would this make any difference?
  13. Curious if anyone knows anything about the LSI SAS3x36R expander? I have been looking at a 2U 24-bay chassis that uses this expander, with the intention of using it as a JBOD to connect SSDs to my server. I'm just trying to make sure it will have the throughput for SSDs. I'm fairly sure it should be absolutely fine during normal use, but in a worst case, if I'm hitting all 24 drives at once, what's the theoretical throughput? EDIT: I would use a 9500-16e HBA to get 3 connections and a PCIe 4.0 link to the HBA.
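     My own rough back-of-the-envelope, assuming SAS3 lanes at about 1.2GB/s usable each and 4-lane wide ports (estimates, not spec-sheet figures):
       lanes=$((3 * 4))                                               # 3 uplinks x 4 lanes each
       echo "$((lanes * 12 / 10)) GB/s usable across the uplinks"     # ~14GB/s
       echo "$((lanes * 1200 / 24)) MB/s per drive with all 24 busy"  # ~600MB/s, plenty for SATA SSDs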
  14. Thanks for the response. Just as you responded I had rebooted the server, and the drive is back to normal operation; I can add it as a cache pool now. Cheers.
  15. Weird behaviour on an NVMe drive. I did have Proxmox installed on this drive, but I now want to clear it and use it as a cache drive on my test server. The drive appears twice in UD, both entries at 8.59GB (it's a 512GB drive). When I try to clear or format the drive it just fails. Is there anything I can do to force clear/format the drive to get it back working as it should? I have a feeling this is just whatever filesystem Proxmox left on the drive causing the issue. Probably nothing to do with UD as such, but any help would be great. tower-diagnostics-20230822-2339.zip
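     For anyone hitting the same thing, this is what I would try from the console once the right device is confirmed (nvme0n1 is just an example name; double-check with lsblk before wiping anything):
       lsblk /dev/nvme0n1      # confirm it's the old Proxmox drive and note any leftover partitions
       wipefs -a /dev/nvme0n1  # remove all remaining filesystem/LVM/partition-table signatures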
  16. I've gone with the 9500-16i again, though without the RAM. I haven't seen anything anywhere about people having to flash the firmware to IT mode and was just guessing it comes in IT mode. Cheers for the quick response.
  17. @Pri Just a question on the 9500... Is IT mode a thing on the newer HBAs? I keep getting errors with my 9300-16i, so I've invested in a 9500-16i. I haven't seen it mentioned anywhere and I'm guessing the newer HBAs work out of the box?
  18. Looking for a bit of help removing an NFS share using the command line. The server that was hosting a remote share shut down due to a power outage. The share is still mounted in UD, and when I clicked the unmount button in UD it locked up the GUI completely. I can SSH into the server, so I'm just looking for the commands to force unmount the share, or ideally force remove it. I don't want to just try a reboot on the off chance it hangs the server completely; I'd like to avoid a hard reset if possible. EDIT: Found my answer. umount -l /pathofmount/ sorted the issue and the GUI was immediately accessible again.
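     For reference, the two umount options usually suggested when the NFS server on the other end is dead (path as in the post):
       umount -l /pathofmount/   # lazy: detach now, clean up once nothing is using it
       umount -f /pathofmount/   # force: the other common option if the lazy unmount isn't enough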
  19. Yeah, I already have 0.5m cables, and airflow itself is fine from what I can gather. The 24 hard drives are sitting at 30-35 degrees and the 10 NVMe drives at 35-45 degrees. Just to be on the safe side I put a 3000rpm fan over the HBAs blowing straight down on them. I think my only option at this point is to try a different HBA. Thanks for the help. It's just the fact that it worked from Feb to Jun with zero issues until I updated the system. I know that is likely just coincidence, but part of me wonders whether there is some system change that is taxing the HBA more as a result. The HBA should of course be able to handle it... I'm just curious as to why it was perfectly fine for so long. Anyway, thanks for your insight. Hopefully an HBA change will sort my issue. I won't get to do it for a while... for now, at least, if I don't run a parity check I get no issues.
  20. Yeah, I was aware of that issue when I was ordering. I checked the serial number when I got it and it was a legit SN, but again that's assuming they didn't copy the SN of an original. As I said, though, it has worked absolutely perfectly since Feb; it has only caused issues since I updated to 6.12.x. It was well stressed and went through multiple parity checks prior to the update without any issues whatsoever. It has been much hotter here recently and I thought that maybe the update was just a coincidence, but tbh it's not that hot today, 16 degrees, so I didn't expect it would be an issue. I don't know why it would all of a sudden start having issues when it was rock solid for so many months. I was just making the assumption it was overheating because so many drives were having the issue, and only when stress was put on the HBA. I was thinking of bumping up to a higher-end HBA so it wouldn't be stressed as much by spinning-rust drives during parity checks, in the hope that it wouldn't get as hot, but I'd like to troubleshoot this one as best I can before I invest more money in it. Also, I have no experience with them, so I have no idea if my theory about them running cooler is true. EDIT: it was marked as used/refurbished I believe. I bought it from a crowd in Germany in the hope of avoiding the Chinese knock-offs. But you never know, I guess.
  21. I just think it is... as I already said, I reseated the cables and the errors went away under normal use, but when I started a parity check it started to spit out the errors again after a couple of hours. Now that I have stopped the parity check it's back to normal. It's just the only thing I can think of that's causing it at this point. The hardware had been running on 6.11.5 since February with no issues and went through a few parity checks. It has gotten a lot warmer since then, of course, so heat is the only thing I can think of at this point.
  22. I built the server in February and cleaned it out yesterday, reseating all the cables while I was at it. The server is clean as a whistle. The 9300 is an eBay buy, so it's possible the heatsink could do with a repaste, but I'd rather not get into doing that. It's just strange timing that it happens with the update.
  23. This only seems to be an issue since I updated to 6.12.x, but I have a feeling that is just a coincidence and my HBA is struggling with the heat. After a few issues since updating to 6.12.0 and then 6.12.1, I thought I had them ironed out, but apparently not... In order to update to 6.12.1 I had to remove my USB to make the update, due to the GUI not being available on 6.12.0. Following the update I started to get CRC errors across all my drives. I assumed this was down to a cable connection issue, probably caused by the vibration of me taking out the USB (the server was in an awkward spot and had to be moved slightly to get at it). I have since moved the server to a more accessible location and reconnected all my SAS cables, and that seemed to solve the issue; no errors at all. Due to other issues I started a parity check, and now 4 hours in I'm starting to get the CRC errors again. They are across several drives, small amounts on most and none above 1000 that I noticed. I have a 9300-16i HBA and I'm concerned it might be overheating and causing the issue. I do have a 3000rpm fan firing straight down on the HBA and the other cards beside it. As a long-term solution, if this is the case, would a newer HBA like a 9400 or 9500 run at lower temps? I have included diagnostics while parity is still running. I'm going to pause the parity check for now and schedule it to run in shorter periods overnight to try to mitigate the issue. server-diagnostics-20230625-1314.zip
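     For keeping an eye on whether the errors are still climbing, the counter being flagged is SMART attribute 199 (UDMA_CRC_Error_Count), which points at the link/cabling rather than the disk itself. A per-drive check, with sdb as an example device name:
       smartctl -A /dev/sdb | grep -i crc   # the raw value only ever increases, so watch whether it changes during a parity check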
  24. Thanks @JorgeB, I'll do this when I re-add them. Thanks again.
  25. I've added 3x 4TB WD Reds. They cleared just fine, but now when I go to format them I get the error "Unmountable: Unsupported or no file system". My logs show an issue with an invalid superblock number. At present the array is up and running just fine, with no issues other than the 3 drives sitting there in that status. These 3 drives were part of the array a long time ago and were upgraded to bigger drives due to a lack of drive bays, but since I added a JBOD I wanted to try adding them back. Since they get fully wiped during the clear, I doubt that's the issue? Is this likely a sign of a bigger problem? Diagnostics included; any help would be greatly appreciated. Also, these are the first drives I've tried to add since the update to 6.12.x. I did have some difficulty following the update, which appeared to be resolved before I tried this. I have since been able to remove the drives from the array and it's back running as normal. I'm running a parity check on it atm just to make sure the parity wasn't affected by this. server-diagnostics-20230624-2103.zip
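     If it does turn out to be leftover signatures from the drives' previous life, a read-only way to check before touching anything (sdX is a placeholder device name):
       wipefs /dev/sdX    # with no options this only lists any filesystem signatures found
       blkid /dev/sdX1    # shows what blkid detects on the partition Unraid created, if present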