BTRFS cache mounted read-only


WEHA


Hello

 

So I had this issue where Unraid started throwing errors, I believe because of a cache drive disconnect (if that's possible; doesn't the OS run from the USB drive?).

I set the array not to start on boot, but because of the errors this preference apparently wasn't saved.

Anyway, after rebooting (with the second cache drive not detected) it started the array without the second cache drive (the pool is RAID1).

The pool was mounted read-only, I guess because of the earlier disconnect.

I rebooted to get the second drive detected again and tried to re-add it in Unraid.

It remained mounted read-only and btrfs still reported the drive as missing, even though everything looked correct in the Unraid GUI.

After searching the internet I found the commands to replace the drive in the RAID1 (adding the re-detected drive and removing the missing one).
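
For reference, the commands were along these lines (from memory; the device name here is just an example, not copied from my system):

# add the re-detected drive back into the pool, overwriting its old signature
btrfs device add -f /dev/nvme1n1p1 /mnt/cache
# then drop the stale "missing" entry
btrfs device delete missing /mnt/cache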

But I'm still stuck at the point where the pool is mounted read-only.

When I execute mount it reports rw, so it's btrfs itself that is refusing writes.
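
Roughly what I'm checking (mount point as Unraid uses it):

# the mount table still claims rw
mount | grep /mnt/cache
# but the kernel log shows btrfs forced the filesystem read-only
dmesg | grep -i btrfs | grep -i readonly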

The only thing I could find about this situation is that a kernel patch is needed to get this working.

I'm not familiar with how to check for or install such a patch in Unraid.

 

Source: 

https://www.mail-archive.com/search?l=linux-btrfs%40vger.kernel.org&q=subject:"raid1\%3A+cannot+add+disk+to+replace+faulty+because+can+only+mount+fs+as+read\-only."&o=newest

https://www.mail-archive.com/[email protected]/msg60979.html

 

Any suggestions?

 

EDIT: added diagnostics

tower-diagnostics-20180414-0850.zip


There were ENOSPC errors during the balance of the pool:

 

Apr 14 06:24:19 Tower kernel: BTRFS: error (device nvme0n1p1) in btrfs_remove_chunk:2882: errno=-28 No space left
Apr 14 06:24:19 Tower kernel: BTRFS info (device nvme0n1p1): forced readonly
Apr 14 06:24:19 Tower kernel: BTRFS info (device nvme0n1p1): 92 enospc errors during balance
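
Note the errno=-28 ("No space left") in the first line: the balance ran out of unallocated space to relocate chunks into, even though the filesystem may still report free space. You can inspect the chunk-level allocation with:

btrfs filesystem usage /mnt/cache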

Your best option is to back up any data on the pool, then format and restore. You can use the link below for help with the backup:

https://lime-technology.com/forums/topic/46802-faq-for-unraid-v6/?do=findComment&comment=543490

 


Thank you for your quick reply.

I was able to copy the data earlier, I hope there is no corruption.

What are ENOSPC errors exactly? Can't seem to find a simple description :)

 

What would be the best way to format the drives:

- using the GUI: just delete the partition and re-add as cache?

- is a full wipe necessary?

- a btrfs command? (e.g. something like the sketch below)
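
For instance, would manually clearing the old signatures be the right idea? Something like this, with the array stopped (device names are just examples, I haven't tried this):

# clear old filesystem signatures on both pool members
wipefs -a /dev/nvme0n1p1
wipefs -a /dev/nvme1n1p1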

 

thanks!

 

 

44 minutes ago, WEHA said:

There was 100GB of free space? How is there no space left?

There could be an allocation or corruption problem resulting in the errors.

 

44 minutes ago, WEHA said:

Shares are empty but appdata & system were on my cache.

Will this restore itself once I restart Unraid?

Not quite clear on what you're asking; shares are just all the top-level folders, both on the data disks and the cache pool.

15 minutes ago, johnnie.black said:

Post new diagnostics

Everything is now copied back; I stopped and started the array: no exportable shares.

New diag attached

tower-diagnostics-20180414-1538.zip

 

Something strange, though, when I "ls /mnt":


16K drwxrwxrwx  1 nobody users 106 Apr 14 15:32 cache/
  0 drwxrwxrwx  3 nobody users  19 Apr 14 10:00 disk1/
  0 drwxrwxrwx  4 nobody users  43 Apr 14 10:00 disk2/
  0 drwxrwxrwx  3 nobody users  19 Apr 14 10:00 disk3/
  0 drwxrwxrwx  3 nobody users  19 Apr 14 10:00 disk4/
  0 drwxrwxrwx 11 nobody users 167 Apr 14 10:00 disk5/
  0 drwxrwxrwx  5 nobody users 100 Apr 14 15:35 disks/
  ? d?????????  ? ?      ?       ?            ? user/
  0 drwxrwxrwx  1 nobody users  19 Apr 14 10:00 user0/
 

/bin/ls: cannot access 'user': Transport endpoint is not connected
 

 

When I "ls user0", those contain the non-cache shares

9 minutes ago, johnnie.black said:

 you should also update to latest.

I wasn't aware of a new version until today since the GUI did not (and still doesn't) mention it like it did last time.

Updating is the last item on my todo list :)

 

So, next problem: I rebooted and of course my second cache drive went "undetected" again.

I updated the SSD firmware this time (BIOS this morning), hoping this fixes it.

After the reboot I reassigned the second drive... and now it's showing as RAID 0 in terms of space.

 

I tried a rebalance with -dconvert=raid1 -mconvert=raid1 but it did nothing.

Do I have to convert it to single first?
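
For reference, the full command I ran was along these lines (mount point as Unraid uses it):

btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/cache
btrfs balance status /mnt/cache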

 

Dashboard shows 1.5TB size, 785GB in use.

Status (these are 2 x 1TB NVMe SSDs, FYI):

btrfs filesystem df:
Data, RAID1: total=732.00GiB, used=731.15GiB
System, RAID1: total=32.00MiB, used=144.00KiB
Metadata, RAID1: total=2.00GiB, used=908.31MiB
GlobalReserve, single: total=512.00MiB, used=0.00B
btrfs balance status:
No balance found on '/mnt/cache'
