Posts posted by CorserMoon

  1. On 3/15/2024 at 4:29 PM, FearlessAttempt said:

I had this same issue. Leaving the "Runtime left to initiate shutdown (minutes)" value blank seemed to be causing it.

Came here to say that this fixed it for me.

So I manually deleted many gigs of data off the drive, but free space according to the GUI didn't change: still 279GB free. I tried running Mover, but it didn't seem to start, even though there is still data sitting on the cache drive that is configured to move onto the array when Mover is invoked. I then rebooted the server; the free space still didn't change, and the files that I deleted are back. I am stuck and don't know what I am doing wrong.

     

EDIT: At this point it seems to make sense to reformat the pool (since I have the backup from the Backup/Restore Appdata plugin). Is there a guide on how to do this? I also have the issue of the missing cache drive, so I'm not sure how to knock the cache pool back down to one drive (it won't let me change the number of devices from 2 back to 1). Or would it be a better idea to pop in a replacement SSD so I'm back up to two drives first and then reformat the pool?

     

    Additional weird observations:

• As stated in my OP, I was also trying to add new drives to the array. At the time, I added them but paused the disk-clear when I noticed issues. I've since removed the new disks, returning those array slots to "unassigned", but now every time I reboot the server, those drives are back and disk-clear starts!
• I tried using one of the aforementioned HDDs to replace the missing cache drive and provide additional space so that btrfs would hopefully be able to balance, but the cache pool still mounts read-only, and I received a new error: Unraid Status: Warning - pool BTRFS too many profiles (You can ignore this warning when a pool balance operation is in progress). See the balance sketch below.
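For reference, and a sketch only: my understanding is that a "too many profiles" warning is cleared by a convert balance once the pool can mount read-write. The raid1 target and the /mnt/super_cache path are my assumptions:

    # rewrite all data and metadata chunks into a single raid1 profile
    # (assumes the pool is meant to stay mirrored)
    btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/super_cache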
  3. 11 minutes ago, JorgeB said:

It's the way btrfs works; it would take more time than I have right now to explain here, but you can Google it; it's easy to find.

     

    No.

     

If it just flashes and disappears, you can ignore it.

     

    Thanks so much for your help.

     

Last questions for now: Would it make sense that one of the cache drives dying led to this full-allocation issue? Could it be resolved by just replacing that one dead drive?

     

I'm just trying to figure out whether I have one issue or multiple different issues.

  4. 13 minutes ago, JorgeB said:

    It's not full, 

     

But the result is the same: it won't be able to write any data until there's some free space to allocate new metadata chunks.

     

    That's strange, works for me:

     

[screenshot]

     

    You need to free up some space first or the balance will likely fail.

     

So what is the difference between allocation and free space? What would cause allocation to fill up, and is there a way to monitor for that? It's just weird that all this started happening after one of the cache drives disappeared. Would full allocation cause this?
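For anyone following along, this is what I've been running to see allocation vs. free space; the pool path is an assumption:

    # 'Device allocated' vs 'Device unallocated' is the chunk-allocation picture;
    # 'Free (estimated)' is closer to what the GUI reports as free space
    btrfs filesystem usage /mnt/super_cache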

     

    I also just noticed that when the array is stopped and I am assigning/un-assigning disks, this error sporadically pops up briefly then disappears:

[screenshot of the error]

     

EDIT: I tried to start the Mover process to move any extraneous data off the cache drive, but Mover doesn't appear to be starting.
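A quick sanity check I know of (a sketch; mover logging may need to be enabled in settings) is to invoke Mover from the CLI and watch the syslog:

    # kick off mover manually, then look for its entries in the syslog
    mover
    grep -i mover /var/log/syslog | tail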

  5. 1 hour ago, JorgeB said:

Pool is missing a device and the other is fully allocated, so it's going read-only. Disable docker/VM services and reboot; if the pool doesn't immediately go read-only, you need to free up some space and rebalance, see here:

     

    https://forums.unraid.net/topic/62230-out-of-space-errors-on-cache-drive/?do=findComment&comment=610551

     

     

I don't think it actually is full, though. The "Super_Cache" pool has two 1TB drives (super_cache and super_cache 2). One disappeared (aka missing), but everything was working fine after I acknowledged that it was missing, since the drives were mirrored (1TB actual space). I was having no issues with docker until this morning. I monitor that capacity closely, and it was ~70% full before all this happened. The GUI currently shows the remaining drive (super_cache 2) with 279GB free.

[screenshot: pool view in the GUI]

     

Strangely, du -sh super_cache/ shows a total size of 476GB. But regardless, it shouldn't be full.
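If it helps, listing the pool members should also flag the dead one; a sketch, since device paths will differ:

    # a failed member shows up here as 'some devices missing'
    btrfs filesystem show /mnt/super_cache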

     

Side note: that link throws this error: You do not have permission to view this topic.

I recently dismantled a secondary, non-parity-protected pool of several HDDs. Two of these drives are to replace the array's existing single parity drive, and the rest are to be added to array storage. I have run into a lot of cascading issues, which have resulted in the docker service not starting. Here is the general timeline:

     

• Stopped the array in order to swap a single 12TB parity drive for 2x14TB parity drives. As soon as the array stopped, one of my two cache drives (2x1TB NVMe, mirrored) disappeared. It shows as missing and doesn't appear in the disk dropdowns. My first thought was that it died.
• Immediately restarted the array (without swapping the parity drives) and performed a backup of the cache pool to the array via the Backup/Restore Appdata plugin. Completed successfully. Everything, including docker, working normally.
• Ordered new NVMe drives to replace both.
• Stopped the array and successfully swapped the parity drives as outlined earlier. Parity rebuilt successfully.
• Stopped the array to add the remaining HDDs to array storage. Added them, started the array, and disk-clear started automatically as expected.
• Got the notification "Unable to write to super_cache" (super_cache is the cache pool). Paused disk-clear and rebooted the server.
• Same error upon reboot. In the interest of troubleshooting, I increased the docker image size to see if that was the issue, but the service still wouldn't start. I AM able to see/read files on the cache drive but can't write to it. A simple mkdir command in the appdata share errors out saying it's a read-only file system.

     

My best guess is that both NVMe drives failed? Or maybe the PCIe adapter they are in failed? Any thoughts or clues from the attached diagnostics as I wait for the replacement drives to arrive?
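In the meantime, this is what I can check from the console while waiting (standard commands; the mount path is an assumption):

    # kernel messages usually say why btrfs forced the filesystem read-only
    dmesg | grep -i btrfs | tail -n 30
    # per-device error counters (read/write/flush/corruption/generation)
    btrfs device stats /mnt/super_cache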

    diagnostics-20231025-1118.zip

Thanks to help and recommendations from @JorgeB, I've learned that my cache pool (two NVMe drives set to mirror) has some uncorrectable errors (based on scrub results). THIS older thread recommends backing the cache pool files up onto the array, wiping/reformatting the drives, and moving the files back onto the cache pool.

     

What is the best practice for moving 600GB from these drives onto the array? Rsync via the webUI terminal? Krusader? Something else?
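In case a concrete command helps frame the question, this is the kind of rsync I had in mind; the paths are just examples:

    # -a preserves permissions/ownership/timestamps, -X keeps extended attributes
    rsync -avX --progress /mnt/cache/ /mnt/disk1/cache_backup/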

     

    And for the "wiping/reformatting" portion, is this the proper command?

     

    blkdiscard /dev/nvmeX
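(My understanding is that blkdiscard should target the whole NVMe device rather than a partition; something like the following, with the actual device names taken from lsblk:)

    # identify the right devices first; blkdiscard erases everything on them
    lsblk -o NAME,SIZE,MODEL
    blkdiscard /dev/nvme0n1   # example device name only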

     


My Unraid server was non-responsive, so I had to force a reboot via IPMI. Upon reboot, I am getting the following error, and the docker tab shows no docker containers installed:

     

    BTRFS: error (device nvme1n1p1) in btrfs_replay_log:2500: errno=-5 IO failure (Failed to recover log tree)

     

[screenshot of the error]

     

I came across THIS post, which seems relevant, but their error was slightly different. Thoughts on how to proceed? (diags attached)

     

    EDIT: Here is another clue. The cache pool on which docker.img lives is showing unmountable:

[screenshot: unmountable pool]
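From what I've read, when btrfs can't replay its log tree, one common recovery step is zeroing the log so the filesystem can mount again; it discards the last moments of writes, so I'd want confirmation before running it. A sketch against the device named in my error:

    # run only with the pool unmounted; drops the unreplayable log tree
    btrfs rescue zero-log /dev/nvme1n1p1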

     

    corsermoon-diagnostics-20230615-1340.zip

Hi all. Woke up this morning to Organizr not working (throwing a "not writeable" error), as well as many other dockers not operating as expected. The next step was checking the log file, which is 100% full. All disks/pools/shares are green and readable, though. The log filled up with BTRFS and rsyslog write errors (I am using a syslog server). Before I reboot to clear the log file, I wanted your expert eyes on it.
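Before rebooting, I grabbed a quick look at what is eating the log space (standard commands; /var/log on Unraid is a RAM-backed tmpfs, so 100% full means it hit its size cap):

    df -h /var/log
    du -sh /var/log/* | sort -h | tail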

    executor-diagnostics-20220714-1053.zip

So earlier today I suddenly lost connection to my Unraid box. After troubleshooting, I determined that the NIC is dead (Mellanox ConnectX-2). So I IPMI'd into the motherboard and used the iKVM console to log into Unraid via the CLI and issued the command 'powerdown'. The problem is that it has been sitting at 'Shutdown Nginx gracefully...' for 30 minutes. Do I have any options besides power cycling it? I'm really trying to avoid that and the 30-hour parity check.

     

[screenshot of the console]
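One thing I'm considering trying from the iKVM console before pulling power, assuming this Unraid release still ships the Slackware-style rc script, is stopping nginx by hand so powerdown can continue:

    # attempt a clean nginx stop; fall back to killing the process
    /etc/rc.d/rc.nginx stop || killall nginx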

  11. 26 minutes ago, ljm42 said:

    OK if you are accessing by IP then DNS isn't the issue. Sorry, all of the tips I have are in the first two posts, I don't have any other ideas.

    Sent from my GM1917 using Tapatalk
     

I'm thinking it is either weirdness with my gateway (AT&T fiber gateway) or corruption/conflicts in the Unraid routing table. I may try resetting the Unraid network settings to see if that helps. I'm also in the process of building a pfSense box and bypassing the gateway. Hopefully one of those fixes the issue.
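For anyone searching later: the way I understand Unraid network settings get reset is by moving the config file off the flash drive and rebooting. A sketch, and I'd keep the backup in case it doesn't help:

    # stash the old settings so they can be restored
    mv /boot/config/network.cfg /boot/config/network.cfg.bak
    reboot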

  12. 2 minutes ago, ljm42 said:

    Give me some examples of things you are trying to access. http://what

    Sent from my GM1917 using Tapatalk
     

With only my router IP as the DNS, I can access Unraid (192.168.1.107) but no internet (http://www.google.com, for example) and no other devices on my LAN, such as 192.168.1.254 (router), 192.168.1.111 (managed switch), or 192.168.1.201 (Hubitat). If I add 8.8.8.8 to the DNS record (so it's 192.168.1.254,8.8.8.8), I can access Unraid (192.168.1.107) and the internet (Google, etc.), but still no other LAN IPs. Right now I'm at my in-laws' on their network, which is 192.168.68.x, so that shouldn't be a conflict.
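For context, this is roughly what I've been checking from the Unraid console when the LAN IPs don't respond (standard iproute2 commands):

    # confirm there's a route to the LAN and which interface/gateway it uses
    ip route
    # see whether the unreachable hosts are even visible at layer 2
    ip neigh show | grep 192.168.1.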

  13. 21 minutes ago, Nodiaque said:

Hello everyone, I seem to have a common issue and I cannot find the problem.

     

I've set up WireGuard with 8.8.8.8 as the DNS. I have Host Access enabled because if I don't, my pihole running on br0 cannot be contacted. "Local server uses NAT" is set to No, and the peer type of access is set to "Remote access to LAN".

     

    I also added 2 rules in my pfsense

    source: 10.253.0.0/24 (vpn)

    destination: unraid ip

    protocol: any

     

    and

    source: 10.253.0.0/24 (vpn)

    destination: lan ip address

    protocol: any

     

With that, I can access the Internet through my VPN and I can reach my Unraid server, but I cannot access anything else on the network (neither docker containers with their own IPs nor other devices on the network). I don't have VLANs, so all my devices are on the same subnet, the same as my server and my dockers with fixed IPs.

     

    Is there a way to have that?

     

    Thank you

     

Yeah, similar issue to mine (though I don't use pihole). I can only access Unraid when I have the DNS set to my router, but then no internet and no LAN. If I add a public DNS like 8.8.8.8, I can then access the internet, but still no LAN. I've read through dozens of threads and reddit posts and still have been unable to get local LAN access to work.
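One thing the WireGuard threads keep pointing at, which I still need to try: the LAN devices don't know the tunnel subnet (10.253.0.0/24 in the quoted setup) exists, so their replies never come back. The usual fix described is a static route on the LAN router; illustrated here with my Unraid IP, though the syntax on an actual router or pfSense box will differ:

    # on the LAN router: send tunnel-subnet traffic back via the Unraid host
    ip route add 10.253.0.0/24 via 192.168.1.107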

  14. 5 minutes ago, Shazster said:

Pheeew!!! Good to hear. I couldn't help but forward the subject title to one of my friends, who is a sysadmin professionally and a father of three. He seemed quite impressed that yours skipped the power-button-pressing phase entirely and went straight for drive yanking.

    Yeah, I'm now in the process of looking into getting a locking cabinet...
