June 15
I stopped the array to remove a cache disk (I have a 5-disk btrfs raid5 cache pool).
I restarted the array; as expected, the cache pool showed as degraded. I started a balance, which went surprisingly fast (too fast to have worked as I expected).
I stopped the array again.
I tried to start the array again, but the start array button did nothing. Pressed it 1-2 more times. No effect.
I initiated a soft reboot.
Then the GUI did not show up.
I attached a monitor and keyboard and soft rebooted once more. Still no GUI.
Rebooted in GUI mode; the GUI does not work there either (nothing loads on localhost).
Attached latest diagnostics.
Help would be much appreciated. Thanks.
Edit: The GUI also does not show up in safe boot GUI mode.
Edited June 17 by ArdNsc
June 16, Community Expert (Solution)
Delete /boot/config/pools/cache.cfg on the flash drive and reboot, and the GUI should be back. You will need to reimport the pool; I can post the steps for that if you need them.
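From the console, one cautious way to do this is to keep a copy of the file before deleting it. The following is only a sketch, assuming a bash shell: the function name `remove_pool_cfg` and the backup directory are hypothetical, and only the path /boot/config/pools/cache.cfg comes from the post above.

```shell
#!/bin/bash
# Hypothetical helper: back up a pool config file, then delete it.
# Only the default config path comes from the thread; the backup
# directory is an assumption made for this example.
remove_pool_cfg() {
    local cfg="${1:-/boot/config/pools/cache.cfg}"
    local backup_dir="${2:-/boot/config/pools-backup}"

    if [ ! -f "$cfg" ]; then
        echo "not found: $cfg" >&2
        return 1
    fi

    mkdir -p "$backup_dir" || return 1
    # Keep a timestamped copy so the old pool assignment can be inspected later.
    cp "$cfg" "$backup_dir/$(basename "$cfg").$(date +%Y%m%d-%H%M%S)" || return 1
    rm "$cfg"
}
```

Calling `remove_pool_cfg` with no arguments would act on the path named in the post; a reboot is still needed afterwards.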
June 16, Author
  1 hour ago, JorgeB said: "Delete /boot/config/pools/cache.cfg on the flash drive, reboot and the GUI should be back"
That worked fine.
  1 hour ago, JorgeB said: "you will need to reimport the pool, can post the steps for that if you need them."
Please do. And do I add the pool with 4 or 5 disks? Remember, I was trying to remove one of them.
June 16, Community Expert
The problem is that since the pool is encrypted, you will need to decrypt it first to see the current status. You can do this; it won't damage the pool if it cannot be imported:
- add a new pool with the same name and 5 slots
- assign the 5 devices, leave the filesystem set to auto
- start the array and post new diags
June 16, Community Expert
The pool was imported degraded; I assume you still want to remove that device?
June 16, Author
Yes, I do. (But I did what you suggested and recreated the pool with all 5 devices.) Should I try to run another balance now?
Edited June 16 by ArdNsc
June 16, Community Expert
Since the device is already missing, the easy way is to remove it using the CLI, then reimport the pool with the 4 remaining devices.
With the array started, type:
    btrfs device remove missing /mnt/cache
Once that's done, stop the array and reimport the pool:
- on Main, click on the first device for that pool and then "remove pool"
- back on Main, create a new pool with the same name and 4 slots now
- assign the 4 current pool devices, leave the filesystem set to auto
- start the array to import the pool
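The CLI part of these steps can be sketched as a small bash function that shows the pool before and after the removal. This is an illustration rather than an official procedure: the `run` wrapper and the `DRY_RUN` switch are hypothetical additions so the commands can be previewed without touching the pool, and /mnt/cache is the mount point used in this thread.

```shell
#!/bin/bash
# Sketch: drop the missing device from a degraded btrfs pool.
# DRY_RUN=1 is a hypothetical switch that prints each command
# instead of executing it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

remove_missing_device() {
    local mount="${1:-/mnt/cache}"
    # List the pool members before changing anything.
    run btrfs filesystem show "$mount" || return 1
    # Remove the absent device; btrfs relocates the affected block groups
    # onto the remaining devices as part of this step.
    run btrfs device remove missing "$mount" || return 1
    # Show device count and allocation after the removal.
    run btrfs filesystem usage "$mount"
}
```

Running `DRY_RUN=1 remove_missing_device /mnt/cache` only prints the three commands; without `DRY_RUN` they are executed in order and the function stops at the first failure.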
June 16, Author
  8 minutes ago, JorgeB said: "With the array started type btrfs device remove missing /mnt/cache"
Output:
    Error: error removing device 'missing': Input/output error
Edited June 16 by ArdNsc
June 16, Author
It adds only one line to the syslog:
    Jun 16 15:35:21 Silver kernel: BTRFS info (device dm-7): relocating block group 27155866779648 flags data|raid5
Edited June 17 by ArdNsc
June 16, Community Expert
The error suggests an I/O error with one of the remaining pool devices, but I don't see anything logged. If you try the command again, does it show the same error?
June 16, Author
I will be afk for a few hours now; in the meantime I will try to move some files off the pool onto the array in case I have to create a new pool. There is no critical data on the pool (all backed up), but it would be much less pain if I didn't have to gather everything back together from backups.
June 16, Community Expert
See if this shows anything more:
    dmesg | grep -i btrfs | tail -n 10
Also post the output from:
    btrfs device stats /mnt/cache
June 16, Author
  5 hours ago, JorgeB said: "See if this shows anything more: dmesg | grep -i btrfs | tail -n 10"
Shows:
    [20036.654570] BTRFS: device fsid 0530b4fe-e499-4564-9465-33258fc10a71 devid 1 transid 236169 /dev/mapper/nvme1n1p1 (251:10) scanned by mount (1699987)
    [20036.654939] BTRFS info (device dm-10): first mount of filesystem 0530b4fe-e499-4564-9465-33258fc10a71
    [20036.654949] BTRFS info (device dm-10): using crc32c (crc32c-lib) checksum algorithm
    [20036.705027] BTRFS info (device dm-10): enabling ssd optimizations
    [20036.705031] BTRFS info (device dm-10): enabling free space tree
    [20036.707844] BTRFS info (device dm-10 state M): turning on async discard
    [20088.423435] BTRFS info (device dm-7): relocating block group 27155866779648 flags data|raid5
    [20150.855244] BTRFS info (device dm-7): relocating block group 27155866779648 flags data|raid5
    [20800.756898] BTRFS info (device dm-7): relocating block group 27155866779648 flags data|raid5
    [42723.637158] BTRFS info (device dm-7): relocating block group 27155866779648 flags data|raid5
  5 hours ago, JorgeB said: "Also post the output from btrfs device stats /mnt/cache"
Shows:
    [/dev/mapper/sdl1].write_io_errs 0
    [/dev/mapper/sdl1].read_io_errs 0
    [/dev/mapper/sdl1].flush_io_errs 0
    [/dev/mapper/sdl1].corruption_errs 0
    [/dev/mapper/sdl1].generation_errs 0
    [/dev/mapper/sdm1].write_io_errs 0
    [/dev/mapper/sdm1].read_io_errs 0
    [/dev/mapper/sdm1].flush_io_errs 0
    [/dev/mapper/sdm1].corruption_errs 0
    [/dev/mapper/sdm1].generation_errs 0
    [/dev/mapper/sdk1].write_io_errs 0
    [/dev/mapper/sdk1].read_io_errs 0
    [/dev/mapper/sdk1].flush_io_errs 0
    [/dev/mapper/sdk1].corruption_errs 0
    [/dev/mapper/sdk1].generation_errs 0
    [/dev/mapper/sdj1].write_io_errs 0
    [/dev/mapper/sdj1].read_io_errs 0
    [/dev/mapper/sdj1].flush_io_errs 0
    [/dev/mapper/sdj1].corruption_errs 0
    [/dev/mapper/sdj1].generation_errs 0
    [devid:5].write_io_errs 0
    [devid:5].read_io_errs 0
    [devid:5].flush_io_errs 0
    [devid:5].corruption_errs 0
    [devid:5].generation_errs 0
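With 25 counters to read, a small filter can make a non-zero value easier to spot. The sketch below assumes the two-column layout shown in the stats output above (counter name, then value); the function name `nonzero_btrfs_stats` is hypothetical.

```shell
#!/bin/bash
# Sketch: read `btrfs device stats` output on stdin and print every
# counter whose value is not zero.
# Exit status: 0 if all counters were zero, 1 if any was non-zero.
nonzero_btrfs_stats() {
    awk '
        NF == 2 && $2 != 0 { print; bad = 1 }
        END { exit bad + 0 }
    '
}
```

It could be used as `btrfs device stats /mnt/cache | nonzero_btrfs_stats`. For the output posted in this thread it would print nothing, since every counter is 0.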
June 17, Community Expert
Yep, nothing else logged. I'm afraid there may be an issue with the pool or a device, or it's hitting some btrfs raid5 bug.
My recommendation would be to back up and recreate the pool, and if you still want raid5, use zfs raidz instead; btrfs raid5/6 is still considered experimental.
June 17, Author
I did what you recommended. I have a new pool with zfs raidz1 now. Thanks for the help.