Everything posted by brandon3055

  1. I really hope I have not misunderstood the planned ZFS implementation. When I hear "ZFS support", single-device pools are not exactly the first thing that comes to mind. My plan this whole time has been to set up my main array as a RAID-Z2 with 8 disks. Please tell me that's actually going to be possible... The sketch below shows the kind of pool I have in mind.
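Just to illustrate what I mean (not a claim about how Unraid will expose it), this is roughly what an 8-disk RAID-Z2 pool looks like when built by hand with the standard ZFS tools; the pool name and device paths are placeholders:

    # Hypothetical example only: an 8-disk RAID-Z2 pool created manually with
    # standard ZFS tooling. "tank" and the /dev/sdX paths are placeholders.
    zpool create tank raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh
    zpool status tank   # two-disk redundancy: the pool survives any two drive failures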
  2. True... But the plugin does not allow you to set up the main array with ZFS, so I would need to sacrifice one of my drives or set up something like a USB flash drive as the array just so I can use the server. Which I may end up doing. But for now I think I am just going to continue using this as an excuse to procrastinate... Edit: I have not tested it, but there is a plugin which provides some GUI support for the ZFS plugin.
  3. Hopefully not too much longer. I have been using the promise of ZFS support as an excuse to delay setting up my new server since before 6.11 was released. But I can only procrastinate for so much longer...
  4. Just wondering if anyone else has run into this issue. It's happened twice so far, and the first time it resulted in a completely bricked VM; even bootrec failed to recover the install. What happens is I shut down the VM, it goes through the normal Windows shutdown process and the display turns off, but according to Unraid the VM never actually stops. Even after half an hour it was still running, and the whole time it had a single core pinned at 100%. The only option I am left with is to force stop the VM. This is a Windows 10 VM with an RX 580, booting from an unassigned device. Initially I thought it might be a GPU issue, because I have had some other GPU issues related to the fact that I am passing through the "primary" GPU, and after killing the VM I had to reboot the system to get video output. But I think that's just the usual AMD reset bug. The first time this happened it was just a normal shutdown, but the second time was a little more interesting. I still haven't figured out what it was, but something on the VM was pinning the CPU so hard the system locked up and I had to send the shutdown command via the VM manager. After a few minutes Windows did manage to kill whatever was pinning the CPU and shut down, at which point I ran into this issue.
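For what it's worth, when it hangs like this I have been checking and killing the guest from the command line with the standard libvirt tools; something like the following, where the VM name is a placeholder:

    # Hypothetical diagnostic steps using standard libvirt tooling; "Windows10" is a placeholder name.
    virsh list --all                 # confirm the guest is still listed as running
    virsh dominfo Windows10          # check its reported state and CPU time
    virsh shutdown Windows10         # ask the guest to shut down again (ACPI)
    virsh destroy Windows10          # hard stop, equivalent to "force stop" in the web UI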
  5. I'm not sure if this should be considered a bug report or a feature request, so mods feel free to move it if it's in the wrong place. Long story short, I recently upgraded my Unraid system and in the process my PCI configuration changed. Previously I had a network card in the top slot, which was a PCIe x1. With my new config I have removed that network card, and the top slot is now a PCIe x16 containing my LSI card. You can probably see where this is going... The other day I tried to start an old VM that I had forgotten I previously passed the network card through to, but now the PCIe address that previously belonged to the network card belongs to the LSI card responsible for all the drives in my array. What happened next was many hours of trying to figure out what just happened, and then repairing my array. Fortunately I was able to avoid data loss, but then I almost made the same mistake again. While I was initially troubleshooting why my array just imploded, I discovered the LSI card was still assigned to the VM and tried to unassign it. After I recovered the array I went back to the VM settings and the LSI card was no longer even a passthrough option, so I assumed I had successfully unassigned it. But just to be safe I checked the XML and found that it was actually still assigned to the VM. I understand this is a very edge-case issue that may not be solvable. But maybe Unraid could keep a map of "PCI address -> device type" or something like that, then give a warning when the device at an address assigned to a VM changes. Or even just do a sanity check each time you start a VM. Unraid obviously knows that LSI card should not be assigned to a VM, so maybe when you go to start the VM it could stop you and say "Hey, this device probably shouldn't be assigned to a VM. Are you sure you want to continue?" Something along the lines of the rough check sketched below.
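To be clear, this is not an existing Unraid feature, just the kind of manual check I mean, using standard tools; the VM name and PCI address are placeholders:

    # Minimal sketch of the sanity check I'm describing, not an actual Unraid feature.
    # "MyVM" and the address 03:00.0 are placeholders.
    virsh dumpxml MyVM | grep -A3 "<hostdev"    # see which host PCI addresses the VM wants to pass through
    lspci -s 03:00.0                            # see what device actually lives at that address now
    # If lspci reports the SAS/SATA controller that Unraid is using for the array,
    # starting the VM is going to end badly.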
  6. Alright, so the rebuild completed successfully and it looks like I managed to avoid any data corruption. I was not able to use UD for any of this because it sees my drives as zfs devices and therefore cannot mount them, but I was able to do everything via the command line. In case anyone finds it useful, here is a summary of what I ended up doing.

First, I rebuilt disk1 onto a spare drive. Then I stopped the array and unassigned the 'new' disk1. Next I regenerated the UUID for the new disk1 using the command provided by johnnie.black, and mounted both the old and the new disk1 in read-only mode, like so:

    root@NAS:/# xfs_admin -U generate /dev/sdf1
    root@NAS:/# mkdir /mnt/new-disk1
    root@NAS:/# mkdir /mnt/old-disk1
    root@NAS:/# mount -r -t xfs /dev/sdf1 /mnt/new-disk1
    root@NAS:/# mount -r -t xfs /dev/sdg1 /mnt/old-disk1

To compare the contents of the drives I used rsync in dry-run mode, in two passes. The first pass used the default comparison, which is based on file timestamps. This gave me a list of files that were most likely modified by normal processes while the disk was being rebuilt, but it was unlikely to identify corrupt files. It's important to note that the rsync direction matters: it has to be old -> new, because if any files are missing completely they will be missing from the new drive, and as a result they would be ignored by a new -> old rsync.

    root@NAS:~# rsync -hanP /mnt/old-disk1/ /mnt/new-disk1/
    sending incremental file list
    ./
    appdata/binhex-delugevpn/deluge-web.log
    appdata/binhex-delugevpn/deluged.log
    appdata/binhex-delugevpn/supervisord.log
    appdata/radarr/nzbdrone.db
    appdata/sonarr/logs.db-shm
    appdata/sonarr/logs.db-wal
    appdata/sonarr/nzbdrone.db
    system/docker/docker.img
    system/libvirt/libvirt.img

Everything there looks normal; these are all files I would expect to be modified while the array is running. For the second pass I used rsync in checksum mode, which compares the entire contents of each file. This WILL detect if any files have been altered due to corruption. What I was hoping to see from this command was the exact same output as the first command, because that would indicate all of the detected changes are 'normal' changes and unlikely to be caused by corruption. Fortunately, that is exactly what I got.

    root@NAS:~# rsync -hancP /mnt/old-disk1/ /mnt/new-disk1/
    sending incremental file list
    ./
    appdata/binhex-delugevpn/deluge-web.log
    appdata/binhex-delugevpn/deluged.log
    appdata/binhex-delugevpn/supervisord.log
    appdata/radarr/nzbdrone.db
    appdata/sonarr/logs.db-shm
    appdata/sonarr/logs.db-wal
    appdata/sonarr/nzbdrone.db
    system/docker/docker.img
    system/libvirt/libvirt.img

It's worth noting this second pass took several hours to complete, as it had to read every file on both disks. The result indicates that both the old drive and the new drive are most likely completely intact with no file corruption. If either or both drives had random corruption, I would expect to see other files in this list, and in that case I would need to inspect both versions of each file, determine which disk has the most valid data, and figure out where to go from there. So at this point I knew all of my data was most likely intact and I had two options: either use New Config to 'import' the old disk 1, then update parity on my good parity disk, followed by a rebuild of my second parity disk (at this point I don't think it would be worth importing); or simply repeat the rebuild process (which I now know works) and rebuild onto the old disk 1, followed by a rebuild of my second parity disk.
I chose option 2, as it's the option that I now know should work, and if it does not, I now have a backup of disk1. It's worth noting that if I did not need the new drive for another system, I could have just left it in place and kept the old disk 1 as a spare.
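One small addition to the comparison above (not something I actually ran, just a sketch using the same mount points): adding --delete to the dry run would also flag any files that exist only on the new disk, and the temporary read-only mounts can be cleaned up afterwards.

    # Hypothetical extra pass: with --delete in a dry run (-n is already in -hancP),
    # rsync also lists files present on the new disk but not the old as "deleting ..." lines.
    rsync -hancP --delete /mnt/old-disk1/ /mnt/new-disk1/

    # Clean up the temporary read-only mounts when done.
    umount /mnt/new-disk1 /mnt/old-disk1
    rmdir /mnt/new-disk1 /mnt/old-disk1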
  7. Ok, rebuild is in progress. I guess I will get back to you in around 4 to 7 hours to let you know how everything went. Thank you for all the help!
  8. As a temporary "in case this does not work" option, or as a permanent solution to my problem? Because I do have a spare 3TB drive, but I have other plans for that drive.
  9. As far as I can tell everything is there, but it's really impossible to know for sure. I'm wondering if it would be a good idea to start the array one last time and back up the "emulated" contents of the disk, because I'm guessing that won't be an option once I run New Config.
  10. Ok, so before I attempt to mount this: does this look normal to you? I expected the file system to be xfs, not zfs_member. I did get an odd error when I tried to mount the drive from the command line in order to back it up:

    root@NAS:/# mount /dev/sdf1 /mnt/recovery
    mount: /mnt/recovery: more filesystems detected on /dev/sdf1; use -t <type> or wipefs(8).

As a result I had to specify the file system manually in order to mount the drive:

    root@NAS:/# mount -r -t xfs /dev/sdf1 /mnt/recovery

Is this normal, or am I looking at possible file system corruption? Edit: This drive was part of a ZFS pool before I switched to Unraid and reformatted everything.
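If anyone hits the same message: wipefs with no options is read-only and just lists the signatures it finds, so it should be safe to use to confirm whether a stale zfs_member label is what mount is tripping over (the device path here is mine, adjust as needed):

    # Read-only: list all filesystem signatures wipefs can see on the partition.
    wipefs /dev/sdf1
    # If both an xfs and a zfs_member signature show up, the zfs_member entry is a
    # leftover label from the old pool. wipefs can erase a specific signature by
    # offset, but I have not done that here.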
  11. Yeah, I figured out I was barking up the wrong tree about 2 minutes after my last post, when Google turned up a mention of the "New Config utility". I guess there is probably no point finishing this backup before I attempt this, especially since rsync is currently estimating 3700 hours for a single 90G file. Here's hoping that's not due to an issue with the drive...
  12. When you say do a 'New Config', what exactly does that mean? I did some digging and I have pieced together a few things, but I never found a definitive set of instructions explaining the process. From what I have gathered, it sounds like I would just back up my license key, then essentially do a fresh install of Unraid. Then, as long as I reassign all my disks to the same slots, my data would be saved and parity would be rebuilt. But if I'm not mistaken, that would also nuke pretty much everything else on the system: Docker, VMs, shares, etc. So what exactly are you suggesting I do? I currently have disk 1 mounted read-only and am backing up its contents, which looks like it will take a while, so I have some time to figure this out.
  13. So I think I know exactly what happened. Before the system upgrade I had a PCI network card passed through to the VM. I removed that card when I did the upgrade, and I'm guessing the LSI card's new PCI ID after the upgrade just happened to line up with the ID of the old network card... So when I tried to start the VM, it did try to pull the LSI card away from Unraid, which obviously caused all hell to break loose. And it makes sense that disk 1 was affected, because disk 1 is where I store appdata, system, etc., so it would have been active at the time, and by extension parity would have also been active. This leads me to question the integrity of the remaining "good" parity disk, which is being used to emulate disk 1. Now that I am pretty sure I know what happened, I think my best bet is to back up the contents of disk 1 (if someone can tell me how to do that), then rebuild the disk and compare it to the backup. But before I do anything I would like a second opinion.
  14. So my NAS consists of two 120GB cache SSDs connected directly to the onboard SATA controller and six 3TB HDDs connected via a reflashed LSI SAS9220 card. It has been running flawlessly for several years now. A couple of weeks ago I upgraded it from an FX-8320 to an i7-4770S and it continued to run flawlessly... until just now.

So here is what happened. As far as I could tell, everything was running fine before this. I went to start up a VM that I have not used since the upgrade and Unraid complained that the installation ISO was not available (that should have been my first hint that something was wrong), but I didn't think much of it at the time, so I just went into the VM settings and removed the ISO as it's no longer needed. Then, as I pressed update, I noticed Unraid had automatically assigned the LSI card to the VM as a passthrough device... the same LSI card that the drives are attached to. It took me a second to realize, because normally that isn't even a passthrough option. I tried to unassign the device, but at this point I was no longer able to make changes to the VM; the update process would just hang indefinitely. Around this time I got a notification that "6 drives in the array have errors". So I gave up on the VM and tried to stop the array, but the stop did nothing. Next I disabled auto start in disk settings and told Unraid to reboot, which did work.

So that's what happened. I still don't understand it, but now I need to deal with the aftermath... After the reboot everything "seemed" fine, so I started the array. Unraid decided this would be a good time for a parity check, so I immediately cancelled that because I was still unsure of the state of the array. I then went back to the VM to unassign the LSI card and found that the LSI card was no longer even a passthrough option, which confirmed that it never should have been an option in the first place. I then tried to start the VM again, only to find that the VM image was not available. It was at this point I discovered two of my drives had been kicked from the array: one of my parity drives and my first data drive, which just so happens to store my VM images among other things. I tried to view the contents of drive 1 via the Unraid UI to see if the contents were emulated, but all I got was an empty folder. So next I rebooted Unraid again, after which the disk appeared to be successfully emulated. I decided to check the contents of the shares to make sure everything was there. I started by opening the movies folder in my media share (which is actually on a completely different disk), but what I got was the contents of a completely different folder. I'm not sure if it was a folder in the same share or from a completely different disk. I checked the same folder (movies) via the Unraid web UI and the contents looked normal. So I figured maybe it had something to do with the fact that I had some of the shares manually mounted in Linux before I rebooted Unraid (and they were still mounted). I unmounted those locations and rebooted Unraid again, and as far as I can tell that fixed the problem. I have not checked every share, but my movies folder now contains movies again.

And so that's where I am at. I can now theoretically reassign those two drives and rebuild them, but given everything that just happened, I have decided to just stop the array and seek some professional help before I start rebuilding disks. At the very least I would like to somehow mount disk 1 and back up its contents first, if that's possible.
I have attached diagnostics from after the initial reboot as well as the diagnostics as of right now. I'm not sure if you need both, but as I have them I figured it couldn't hurt. I initially thought this was caused by Unraid assigning the LSI card to the VM, but considering the VM initially failed to start because of a missing file, I'm guessing disk 1 was gone before I tried to start the VM. Unless the VM tried to pull the LSI card from Unraid when I initially tried to start it. I just have no idea why Unraid tried to give the LSI card to that VM... nas-diagnostics-20200511-1306.zip nas-diagnostics-20200511-1428.zip
  15. So this is an issue I have been living with for a while, but it's really starting to annoy me, so I am wondering if anyone knows a fix. I don't have a great understanding of the Linux permission system, so please forgive me if this has an obvious fix. The issue is with my media share, which is accessible to all users and guests. I run Ubuntu as my main OS and I have the media share mounted with NFS like so:

    192.168.1.133:/mnt/user/Media\040Library /mnt/share/MediaLibrary nfs defaults,nofail 0 0

The issue is that whenever I create a new folder in the media share, its permissions are set as `rwxr-x---` with owner 99 (nobody) and group 'users'. As a result, no one else on the network (Windows users) can access the folder. This only seems to be an issue with NFS; folders created via SMB are owned by the user that created them and have permissions `rwxrwxrwx`. Has anyone encountered this issue before, or does anyone know what I am doing wrong? The way I currently get around the issue is by running 'Docker Safe New Perms' every time I add something to the media share, but that's getting rather annoying.
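For anyone who finds this later, this is roughly the manual workaround I use (my own commands, not a claim about what the 'Docker Safe New Perms' button does internally). On the Ubuntu side it's also worth checking the umask of whatever is creating the folders, since rwxr-x--- is what a 027 umask produces for new directories:

    # On the Ubuntu client: see what umask the shell (or app) is creating files with.
    umask                      # 0027 would explain new dirs ending up as rwxr-x---

    # On the Unraid box: manually re-open permissions on just the media share.
    chown -R nobody:users "/mnt/user/Media Library"
    chmod -R u+rwX,g+rwX,o+rX "/mnt/user/Media Library"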
  16. Oh... Ok... so it does give "an error", but it's just a notification... I don't pay much attention to those because I get a lot of them for system status and whatnot (I may want to adjust my notification settings). I kind of expected the drive to go red and display some "device missing" message or something. Instead, other than that notification, it just pretends nothing is wrong...
  17. Ok, so it looks like you're right, it's the power connection. I swapped the SATA cable with another drive and nothing happened. Then I swapped the power cables and suddenly the drive I swapped the cables with was not detected. It makes sense: I ran out of SATA power plugs on my PSU, so I hacked in some additional plugs from a cheap Molex to SATA converter (very cheap, apparently). This answer just didn't make much sense to me because the issue only occurs when the array is offline... unless I am wrong in assuming Unraid will throw all sorts of errors at me if a cache drive is lost while the array is online?
  18. So I have an SSD that keeps vanishing from Unraid... This is one of two identical Kingston 120GB SSDs that I have configured as a btrfs cache pool. They are both connected to a cross-flashed IBM ServeRAID M1015 card (they are the only two drives currently connected to that card). That cache pool had never had a problem until a couple of months ago when, after rebooting the box, one of the drives randomly became unassigned. I could not figure out what happened, so I just reassigned the drive and started the array. That wiped the drive, but it's a redundant pool and there wasn't anything important on it at the time anyway. Since then the array has been running continuously and has not had a single issue with that drive. Then yesterday I started messing with some unrelated network issues that required me to take the array offline several times, and this is where I started running into issues again. On several occasions, the instant I took the array offline the drive would just vanish. I found that there are a number of ways I can get it to come back. Sometimes it would come back on its own without me doing anything; I would just refresh the page and it would be there (I suspect that's what happened when it "unassigned" itself a couple of months ago). Other times I would need to reboot at least once, but sometimes several times, before it would show up again. I also found that simply unplugging the drive and plugging it back in again would almost always fix it. I tried using a different connector on the SAS breakout cable, but that made no difference. Also, sometimes the drive would show up like this (it did not assign itself like that; I did, so I could take the screenshot). I find this rather confusing... It sounds like a bad SSD, but if that were the case, why would it only ever show problems when I stop the array? I'm sure I could start the array right now, and as long as I don't ever stop it, I'm sure it would still be running fine in 6 months. The other possibility is my cross-flashed LSI card, but I suspect that would affect both drives. There is also the eBay SAS breakout cable to consider, but the other drive is connected to the same cable and I tried all three remaining plugs on that cable; they all have the same issue. The power connection is also a little sketchy, but I don't think this is an issue with any physical connections, because last I checked, electrical faults don't wait for you to press "stop array".
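In case it helps with diagnosis, my plan for the next time it drops out is to watch the kernel log while stopping the array and check the drive's SMART data; these are just standard tools, and the device path is a guess:

    # Follow kernel messages while stopping the array to see whether the drive
    # actually drops off the SAS controller (link resets, "rejecting I/O", etc.).
    dmesg -wT

    # In another shell, check the SSD's SMART health; /dev/sdX is whatever
    # letter the Kingston drive currently has.
    smartctl -a /dev/sdX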
  19. Figured out a solution to this issue over on the pfSense forum. https://forum.netgate.com/topic/139027/pfsense-blocks-static-and-dhcp-ip-requests-from-unraid-when-bridging-is-enabled/13 TL;DR: the pfSense MAC address for the LAN interface defaults to "00:00:00:00:00:00" and Unraid does not like that. I set a proper MAC address and now everything works fine. Edit: Looks like this is more likely an issue with my LAN NIC, not pfSense.
  20. You are correct; that's why I said "pfSense blocks static and DHCP" in the title. The reason it took me so long to figure out that this is a pfSense issue is that I was always using a static IP, and it simply wasn't working and I had no idea why. I only started piecing things together when I decided to try DHCP for the hell of it and noticed it was being assigned 169.254.14.54. Some more research and I discovered that's a fallback IP used when DHCP fails. That eventually led me to take a look at the pfSense logs, where I found that Unraid was being blocked. After thinking about it, I guess it's possible Unraid is using the normal IP for its DHCP request and failing; what I am seeing in the pfSense log may just be a result of Unraid trying to access the network with its fallback IP after DHCP has already failed. In any case, I can only assume whatever is preventing Unraid from getting a response from DHCP is the same thing that is breaking static IP assignment.
  21. So I have been banging my head against the wall for months now trying to figure out why Unraid's networking just implodes when I turn on bridging. Eventually I concluded it must be an issue with the Realtek NIC I'm using. I finally switched to a NIC from the Unraid recommended hardware list and... same problem. After a lot more debugging I found the issue. I don't know much about this topic, but from my digging I have discovered that when most devices request an IP via DHCP they send a request to 255.255.255.255 with a source of 0.0.0.0. I assume this is what Unraid does when bridging is disabled, because everything works. However, when I enable bridging, Unraid switches to the "link local" range (169.254.x.x): it sends its request to 169.254.255.255 with a source of 169.254.14.59. Now, I don't know if this is normal or a bug. Other routers seem OK with this, but it turns out pfSense just completely blocks all traffic on the link-local address range, thereby preventing Unraid from ever getting a DHCP response. This forces Unraid to use a fallback link-local address (169.254.14.59 in my case). It even shows in the pfSense log that the request is being blocked; notice the source IP matches the IP Unraid falls back to. Also, a bunch more entries show up every time I connect the Unraid box to the network, so there is no doubt that this is Unraid being blocked. Unfortunately, all my research has told me that the rule that blocks link-local traffic is hard-coded into pfSense, so I don't think there is anything I can do on that end to work around it. This also may or may not have something to do with the fact that pfSense is not connected directly to the internet, but instead to a DMZ that eventually leads to the internet.
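For anyone wanting to verify the same behaviour, this is roughly how I watched the DHCP traffic; the interface name is whatever your bridge or NIC is called (br0 is Unraid's default bridge):

    # Capture DHCP traffic on the bridge to see the source address of the
    # DISCOVER/REQUEST packets Unraid sends.
    tcpdump -vvni br0 port 67 or port 68
    # A healthy request shows 0.0.0.0.68 > 255.255.255.255.67; a request sourced
    # from 169.254.x.x is the behaviour described above.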
  22. Thanks for the help jonp, I will have to see if I can find a NIC or a board with an Intel NIC to test that. The NIC currently installed is an add-in card, as I was curious whether using a different NIC would solve the issue. But on closer inspection it seems the built-in NIC is also Realtek, so replacing Realtek with another Realtek really doesn't tell me much. Edit: Will any old Intel NIC work, or should I be looking for something specific? The cheapest Intel NIC available to me already costs twice what the NIC I have installed did https://au.pcpartpicker.com/product/xsdqqs/intel-wired-network-card-expi9301ctblk
  23. Here are the diagnostics from the box. This probably should have been included in my first post. nas-diagnostics-20180811-1640.zip
  24. I didn't think of that. I was thinking more along the lines of telling Unraid to use the second NIC as its primary, but that is a much simpler solution. Unfortunately that still didn't fix the issue, so it's probably not the built-in NIC. If anyone else has any ideas, I am open to suggestions. Here is a screenshot of my network configuration and what happens when I turn on bridging.
  25. So it just occurred to me that maybe this is a problem with my board's built-in NIC, because no one else seems to be having this problem. If this is the case, the solution should be as simple as disabling the built-in NIC and using the add-in card as the primary NIC. The only issue is I don't see a way to do this from the web UI. Is this something that can be done via the command line?