BTRFS Errors on boot


lmanstl


I was in the process of reconfiguring the array to consolidate pools and add some new drives when some of my cache drives spontaneously disconnected, which prevented the array from starting. (This sometimes happens when I start the array normally, and a reboot fixes it. I traced the problem to some iffy connectors in my JBOD, but they usually work, so I live with it.) I attempted a soft reboot using the UI button, but nothing happened. After letting it sit for 30 minutes with nothing visibly happening, I hard rebooted.

 

On the next boot there were some BTRFS errors:

 

BTRFS: error (device sdaf1) in cleanup_transaction:1942: errno=-28 No space left

BTRFS: error (device sdaf1) in reset_balance_state:3327: errno=-13 Readonly file system

 

These errors appear immediately after the login prompt shows, and the web UI either does not start or is not accessible. The integrated graphics on the server's motherboard do not support booting the web UI locally, so I can't check it that way. The command line still works, though.

 

Soft rebooting after these errors causes it to get stuck on "Starting diagnostics collections..." for more than 20 minutes (I hard rebooted after that). 

 

Searching for these errors led me to this link: https://www.suse.com/support/kb/doc/?id=000019843. However, my BTRFS knowledge ends at using the Unraid GUI, and I have no idea what the mountpoint would be, or whether that solution would even work in this case.

 

Any assistance with this issue would be appreciated. Thanks.

Link to comment

Cache pool has too many missing devices:

 

Oct 14 07:52:17 My-NAS emhttpd: shcmd (98): mkdir -p /mnt/cache
Oct 14 07:52:17 My-NAS emhttpd: /mnt/cache uuid: 1eaddf7b-e831-47dc-89f0-98ed5e0d8894
Oct 14 07:52:17 My-NAS emhttpd: /mnt/cache TotDevices: 9
Oct 14 07:52:17 My-NAS emhttpd: /mnt/cache NumDevices: 6
Oct 14 07:52:17 My-NAS emhttpd: /mnt/cache NumFound: 6
Oct 14 07:52:17 My-NAS emhttpd: /mnt/cache NumMissing: 3
Oct 14 07:52:17 My-NAS emhttpd: /mnt/cache NumMisplaced: 0
Oct 14 07:52:17 My-NAS emhttpd: /mnt/cache NumExtra: 0
Oct 14 07:52:17 My-NAS emhttpd: /mnt/cache LuksState: 0
Oct 14 07:52:17 My-NAS emhttpd: /mnt/cache mount error: Too many missing/misplaced devices
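To pull the same per-pool counters out of a syslog yourself, something like the sketch below should work. It assumes the log is at /var/log/syslog and that emhttpd writes the counters in the format shown above; the function name pool_summary is made up for illustration.

```shell
# Hypothetical helper: print the device counters emhttpd logged for one pool.
# $1 = pool mount point (e.g. /mnt/cache)
# $2 = log file (assumed to be /var/log/syslog if omitted)
pool_summary() {
    pool="$1"
    log="${2:-/var/log/syslog}"
    # Keep only the counter lines for this pool, then print "Name: value".
    grep -F "emhttpd: $pool " "$log" |
        grep -E '(TotDevices|NumFound|NumMissing|NumMisplaced): ' |
        awk '{ print $(NF-1), $NF }'
}

# Example: pool_summary /mnt/cache
```

If NumFound is lower than TotDevices, the pool is being mounted with devices missing, which matches the mount error in the log above.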

 

The plex pool has one misplaced device and is crashing on mount. Please post the output of:

 

btrfs fi usage -T /mnt/plex

 

Link to comment
root@My-NAS:~# btrfs fi usage -T /mnt/plex
Overall:
    Device size:                 476.95GiB
    Device allocated:            238.47GiB
    Device unallocated:          238.47GiB
    Device missing:                  0.00B
    Used:                        181.44GiB
    Free (estimated):            294.00GiB      (min: 294.00GiB)
    Free (statfs, df):           238.47GiB
    Data ratio:                       1.00
    Metadata ratio:                   1.00
    Global reserve:              512.00MiB      (used: 0.00B)
    Multiple profiles:                  no

              Data      Metadata System
Id Path       single    single   single   Unallocated
-- ---------- --------- -------- -------- -----------
 2 /dev/sdae1 235.44GiB  3.00GiB 32.00MiB     1.02MiB
 3 /dev/sdw1          -        -        -   238.47GiB
-- ---------- --------- -------- -------- -----------
   Total      235.44GiB  3.00GiB 32.00MiB   238.47GiB
   Used       179.91GiB  1.53GiB 48.00KiB


 

The cache pool may be missing devices because it was one of the pools I was altering. It was empty, and I was trying to remove a failing drive (only one, so I am not sure why three devices are missing). However, all the drives that were there before are still present, and they all show up in the RAID controller configuration utility. I am not sure why plex seems to be set to single mode; I remember setting it to RAID1 when I created it.

Link to comment
24 minutes ago, lmanstl said:

I am not sure why plex seems to be set to single mode.

Because according to Unraid one of them is not correctly assigned.

 

The file system is fully allocated, which is why it's running out of space. A balance would fix it, but it won't work with the misassigned device, since the pool goes read-only after the error.
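As a rough way to spot this condition from the table output, the sketch below flags any device whose Unallocated column is under 1 GiB. The helper name low_unallocated and the 1 GiB threshold are my own choices for illustration, not anything Unraid or btrfs defines.

```shell
# Hypothetical helper: read `btrfs fi usage -T` output on stdin and print
# every device whose Unallocated column (the last one) is below 1 GiB.
low_unallocated() {
    awk '
        BEGIN {
            mult["B"] = 1; mult["KiB"] = 1024; mult["MiB"] = 1024 ^ 2
            mult["GiB"] = 1024 ^ 3; mult["TiB"] = 1024 ^ 4
        }
        $2 ~ /^\/dev\// {
            unit = $NF
            gsub(/[0-9.]/, "", unit)        # strip the number, keep the unit
            if ((unit in mult) && ($NF + 0) * mult[unit] < 1024 ^ 3)
                print $2, $NF
        }'
}

# Example: btrfs fi usage -T /mnt/plex | low_unallocated
```

Against the output posted above, this would flag /dev/sdae1 with 1.02MiB unallocated. Once a pool mounts read-write again, a filtered balance such as `btrfs balance start -dusage=75 /mnt/plex` is a common way to free up allocated chunks; the 75 is only a typical starting value, not something taken from this pool.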

 

If array auto-start is enabled, disable it by editing /config/disk.cfg on the flash drive and changing startArray="yes" to "no", then reboot and post a screenshot of Main before starting the array.
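Since only the command line is reachable, the edit can also be done there. A minimal sketch, assuming the flash drive is mounted at /boot on the running server (so the file would be /boot/config/disk.cfg); the helper name disable_autostart is made up:

```shell
# Hypothetical helper: switch startArray from "yes" to "no" in a disk.cfg file.
# $1 = path to disk.cfg (assumed to be /boot/config/disk.cfg on a live server)
disable_autostart() {
    cfg="$1"
    cp "$cfg" "$cfg.bak"                                  # keep a backup copy
    sed -i 's/^startArray="yes"/startArray="no"/' "$cfg"  # flip the setting
    grep '^startArray=' "$cfg"                            # show the result
}

# Example: disable_autostart /boot/config/disk.cfg
```

The backup lands next to the original as disk.cfg.bak, so the change is easy to undo.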

Link to comment

OK, I think I found the issue. The plex pool was looking for the sdae drive, but that device path was reassigned to another disk after the shuffle. That disk belongs to the virtualmachine pool, which is normally initialized before the plex pool. It caused problems when starting the array because plex is where my appdata and system shares are located. Disabling Docker allowed the array to start fine, and I was able to format the other pools with the changes and begin clearing the new data disks in the array.
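For reference, sdX letters are handed out at boot and can move between disks, so it helps to check which physical disk currently sits behind a given letter. A small sketch, assuming the usual udev symlink directory /dev/disk/by-id is present; the helper name map_disk_ids is hypothetical:

```shell
# Hypothetical helper: show which kernel device each persistent ID points at.
# $1 = directory of udev symlinks (assumed /dev/disk/by-id if omitted)
map_disk_ids() {
    dir="${1:-/dev/disk/by-id}"
    for link in "$dir"/*; do
        [ -L "$link" ] || continue          # skip anything that is not a symlink
        target=$(readlink -f "$link")       # resolve to the real device node
        printf '%s -> %s\n' "${link##*/}" "${target##*/}"
    done
}

# Example: map_disk_ids | grep sdae
```

Comparing this mapping before and after a reshuffle shows whether a letter like sdae now belongs to a different disk.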

 

Docker is not currently running, and the plex pool is still read-only. I am starting a full backup of all files on the pool, and then I will figure out what to do. I am considering deleting the pool, recreating it, and restoring from the backup, since the mover does nothing after I switch the cache option from Prefer to Yes or point the share at another cache pool. That is the plan unless you have any other ideas to make it work.

 

I will not have physical access to the machine for the next week and a half so any solutions would have to be software solutions.

Screenshot 2021-10-14 112502.png
