ANJ_ Posted September 17, 2024 (edited)

I've admittedly been having a lot of trouble with my server ever since my flash drive died and I rebuilt a new flash. Most things I've been able to get past, but about once a week some container or issue pops up needing a restart.

Today, however, the Fix Common Problems app is saying that my cache drive is read-only, out of nowhere. This did happen once before after rebuilding my flash drive, but all was resolved when I ran a scrub on the cache drive. This time, though, Unraid doesn't seem to allow me to even attempt a scrub: when I hit the scrub button it just shows 'aborted' with no attempt. I tried swapping the SATA cable, as I read that could be a problem despite not getting any UDMA errors, but that didn't help either.

My logs show a handful of concerning-looking errors, but as I am fairly new to the whole server and networking world I don't know where to go with it. I've attached diagnostics as well as logs. Some things I'm seeing:

This error is spammed quite often on startup. It doesn't appear to be the cause of my suddenly read-only drive, but maybe it's related.

Quote
Sep 17 10:47:00 ANJNAS nginx: 2024/09/17 10:47:00 [error] 8183#8183: *3471 limiting requests, excess: 20.987 by zone "authlimit", client: 192.168.1.23, server: , request: "GET /login HTTP/1.1", host: "192.168.1.9", referrer: "http://192.168.1.9/Main/Settings/Device?name=cache"

Here is where the drive seems to suddenly switch to read-only:

Quote
Sep 17 10:44:43 ANJNAS kernel: Call Trace:
Sep 17 10:44:43 ANJNAS kernel: <TASK>
Sep 17 10:44:43 ANJNAS kernel: ? __warn+0xab/0x122
Sep 17 10:44:43 ANJNAS kernel: ? report_bug+0x109/0x17e
Sep 17 10:44:43 ANJNAS kernel: ? __btrfs_free_extent+0x4cf/0xc02
Sep 17 10:44:43 ANJNAS kernel: ? handle_bug+0x41/0x6f
Sep 17 10:44:43 ANJNAS kernel: ? exc_invalid_op+0x13/0x60
Sep 17 10:44:43 ANJNAS kernel: ? asm_exc_invalid_op+0x16/0x20
Sep 17 10:44:43 ANJNAS kernel: ? __btrfs_free_extent+0x4cf/0xc02
Sep 17 10:44:43 ANJNAS kernel: ? _raw_read_trylock+0x36/0x5c
Sep 17 10:44:43 ANJNAS kernel: ? btrfs_merge_delayed_refs+0x66/0x16e
Sep 17 10:44:43 ANJNAS kernel: __btrfs_run_delayed_refs+0x698/0xbe2
Sep 17 10:44:43 ANJNAS kernel: btrfs_run_delayed_refs+0x65/0x146
Sep 17 10:44:43 ANJNAS kernel: ? start_transaction+0x1fe/0x44d
Sep 17 10:44:43 ANJNAS kernel: btrfs_commit_transaction+0x76/0xa79
Sep 17 10:44:43 ANJNAS kernel: ? start_transaction+0x3dd/0x44d
Sep 17 10:44:43 ANJNAS kernel: ? schedule_timeout+0x5a/0xd7
Sep 17 10:44:43 ANJNAS kernel: transaction_kthread+0x105/0x17b
Sep 17 10:44:43 ANJNAS kernel: ? btrfs_cleanup_transaction.isra.0+0x3cc/0x3cc
Sep 17 10:44:43 ANJNAS kernel: kthread+0xe4/0xef
Sep 17 10:44:43 ANJNAS kernel: ? kthread_complete_and_exit+0x1b/0x1b
Sep 17 10:44:43 ANJNAS kernel: ret_from_fork+0x1f/0x30
Sep 17 10:44:43 ANJNAS kernel: </TASK>
Sep 17 10:44:43 ANJNAS kernel: ---[ end trace 0000000000000000 ]---
Sep 17 10:44:43 ANJNAS kernel: BTRFS: error (device sdb1: state A) in __btrfs_free_extent:3072: errno=-2 No such entry
Sep 17 10:44:43 ANJNAS kernel: BTRFS info (device sdb1: state EA): forced readonly
Sep 17 10:44:43 ANJNAS kernel: BTRFS error (device sdb1: state EA): failed to run delayed ref for logical 208224256 num_bytes 16384 type 176 action 2 ref_mod 1: -2
Sep 17 10:44:43 ANJNAS kernel: BTRFS: error (device sdb1: state EA) in btrfs_run_delayed_refs:2149: errno=-2 No such entry

I don't know what to make of these logs, but they show where the drive is suddenly "forced readonly". Before that point, Fix Common Problems doesn't show the unable-to-write error; after that point it does. I've allocated 30GB for Docker and only 17GB of that is being used, so Docker itself doesn't appear to be full.
I'm debating rebuilding Docker if it's recommended, but I'm also a bit afraid of going that route before trying anything else, as it feels a bit nuclear and could be a pain getting everything back up and running again. I'm hoping someone can help point me in the right direction here; I'm at a loss. I'm having issues with most of my containers because of this.

Also, a random bonus error I don't really care much about: Fix Common Problems also lists Write Cache as being disabled on disk1. Even when I check the disk or switch on the write cache with "hdparm -W 1 /dev/(diskID)", Fix Common Problems still says it's disabled. I can ignore this for now, as I obviously have bigger fish to fry.

anjnas-diagnostics-20240917-1050.zip anjnas-syslog-20240917-1750.zip

Edited September 17, 2024 by ANJ_
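For anyone trying to pin down the same moment in their own syslog, here is a minimal Python sketch that pulls the BTRFS kernel lines out of a log and finds the first "forced readonly" event. The regular expression is only fitted to the four BTRFS lines quoted in this post, so treat it as an assumption about the log format; the names `find_btrfs_events`, `first_forced_readonly`, and `SAMPLE_LINES` are made up for illustration.

```python
import re

# Pattern fitted to the kernel BTRFS lines quoted in the post, e.g.
#   "... kernel: BTRFS info (device sdb1: state EA): forced readonly"
# Other kernel versions may format these lines differently (assumption).
BTRFS_LINE = re.compile(
    r"kernel: BTRFS:? (?P<level>\w+) "
    r"\(device (?P<device>\w+)[^)]*\):? (?P<message>.*)"
)

# Lines copied from the log excerpt in the post.
SAMPLE_LINES = [
    "Sep 17 10:44:43 ANJNAS kernel: ---[ end trace 0000000000000000 ]---",
    "Sep 17 10:44:43 ANJNAS kernel: BTRFS: error (device sdb1: state A) in __btrfs_free_extent:3072: errno=-2 No such entry",
    "Sep 17 10:44:43 ANJNAS kernel: BTRFS info (device sdb1: state EA): forced readonly",
    "Sep 17 10:44:43 ANJNAS kernel: BTRFS error (device sdb1: state EA): failed to run delayed ref for logical 208224256 num_bytes 16384 type 176 action 2 ref_mod 1: -2",
    "Sep 17 10:44:43 ANJNAS kernel: BTRFS: error (device sdb1: state EA) in btrfs_run_delayed_refs:2149: errno=-2 No such entry",
]


def find_btrfs_events(lines):
    """Return (level, device, message) for every line matching BTRFS_LINE."""
    events = []
    for line in lines:
        match = BTRFS_LINE.search(line)
        if match:
            events.append(
                (match["level"], match["device"], match["message"].strip())
            )
    return events


def first_forced_readonly(lines):
    """Return the first BTRFS 'forced readonly' line, or None if absent."""
    for line in lines:
        if "BTRFS" in line and "forced readonly" in line:
            return line.strip()
    return None


if __name__ == "__main__":
    for level, device, message in find_btrfs_events(SAMPLE_LINES):
        print(f"{device} [{level}] {message}")
    print("First read-only event:", first_forced_readonly(SAMPLE_LINES))
```

To run it against a real log, pass an open file object instead of `SAMPLE_LINES`; the syslog path on your system is something you would need to supply yourself.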
Solution
JorgeB Posted September 17, 2024

With this type of btrfs error, I always recommend backing up the pool, then reformatting.
ANJ_ Posted September 17, 2024 Author

Just now, JorgeB said:
With this type of error with btrfs, I always recommend backing up the pool, then re-formatting.

I've read that some people also recommend switching to XFS, specifically if you only have one cache drive, which I do. Do you recommend this as well? I'm also not sure of the best way to go about backing up the pool. I have the Appdata Backup plugin, but wouldn't I still need to rebuild each individual container (I guess through templates) and then run the Appdata Backup restore? I imagine I'm going to have to do a lot of reconfiguring, which of course is a pain.
JorgeB Posted September 17, 2024

With a single device, and if you don't care about checksums or snapshots, XFS can be a good option; it's generally more robust than btrfs, especially with marginal hardware.
ANJ_ Posted September 17, 2024 Author (edited)

2 minutes ago, JorgeB said:
With a single device, and if you don't care for checksums or snapshots, XFS can be a good option, it's generally more robust than btrfs, especially with marginal hardware.

OK, noted, thank you. How about the pool backup method: is what I previously mentioned the way to go about it, or is there an easier solution I may be unaware of? I'd appreciate your input, if you don't mind.

Edited September 17, 2024 by ANJ_
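The thread does not spell out a backup method, so the following is only one possible sketch, not JorgeB's recommendation: copy the pool's contents to another location and compare file counts before reformatting. It uses Python's standard `shutil.copytree`; the function names `backup_tree` and `count_files` are hypothetical, and any real source and destination paths (for example an appdata folder on the pool and a folder on an array disk) are assumptions you would need to fill in. Note that `shutil` copies file contents, permissions, and timestamps but not ownership, which can matter for container data.

```python
import shutil
import tempfile
from pathlib import Path


def count_files(root):
    """Count regular files under root, recursively."""
    return sum(1 for path in Path(root).rglob("*") if path.is_file())


def backup_tree(source, destination):
    """Copy source into destination and return (source_files, copied_files).

    Symlinks are copied as symlinks. Use a fresh destination directory so
    the two counts are comparable.
    """
    shutil.copytree(source, destination, symlinks=True, dirs_exist_ok=True)
    return count_files(source), count_files(destination)


if __name__ == "__main__":
    # Demo on a throwaway directory so nothing real is touched.
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "appdata"  # stands in for the pool's data
        (source / "container_a").mkdir(parents=True)
        (source / "container_a" / "config.xml").write_text("<config/>")
        (source / "notes.txt").write_text("hello")

        destination = Path(tmp) / "cache_backup"  # stands in for the target
        before, after = backup_tree(source, destination)
        print(f"source files: {before}, copied files: {after}")
```

A matching count is only a coarse sanity check; it does not prove the copied files are intact, so a checksum comparison would be a reasonable next step for anything important.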
ANJ_ Posted September 18, 2024 Author

Is there a prerequisite to changing the file system of the cache? The file system type is not selectable for me.
JorgeB Posted September 18, 2024

You may need to click the "erase" button first for that pool, on the same page where you select the filesystem.
ANJ_ Posted September 18, 2024 Author

10 hours ago, JorgeB said:
You may need to click the "erase" button first for that pool, in the same page as where you select the filesystem.

That was the ticket, thank you.
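After a reformat like the one discussed above, one way to confirm which filesystem a pool ended up with, and whether it is mounted read-only, is to read the Linux mount table. The sketch below parses text in the `/proc/mounts` format; the function name `mount_info`, the `/mnt/cache` mount point, and the sample lines are illustrative assumptions, not output taken from this server.

```python
# Hypothetical sample in /proc/mounts format:
#   device  mount_point  fstype  options  dump  pass
SAMPLE_MOUNTS = """\
/dev/sdb1 /mnt/cache btrfs ro,noatime,space_cache=v2 0 0
/dev/md1p1 /mnt/disk1 xfs rw,noatime,nouuid 0 0
"""


def mount_info(mounts_text, mount_point):
    """Return (fstype, read_only) for mount_point, or None if not listed.

    If the mount point appears more than once, the last entry wins, since
    later mounts shadow earlier ones.
    """
    result = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            options = fields[3].split(",")
            result = (fields[2], "ro" in options)
    return result


if __name__ == "__main__":
    print(mount_info(SAMPLE_MOUNTS, "/mnt/cache"))
    print(mount_info(SAMPLE_MOUNTS, "/mnt/disk1"))
```

On a live Linux system the same function could be fed the contents of `/proc/mounts`, for example `mount_info(open("/proc/mounts").read(), "/mnt/cache")`, assuming that is where the pool is mounted.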