August 13, 2023
Hi all,
Log has filled up with BTRFS errors. I ran a correcting scrub and got:
UUID: 22930f84-d034-4339-bdca-5b002a3e45e3
Scrub started: Sat Aug 12 19:47:45 2023
Status: finished
Duration: 0:19:45
Total to scrub: 354.29GiB
Rate: 306.17MiB/s
Error summary: read=46387735 super=2
Corrected: 0
Uncorrectable: 46387735
Unverified: 0
Attaching diagnostics. Any help appreciated. Thanks!
Edited December 2, 2024 by spall: Remove attachment
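For anyone who wants to check a scrub result like the one above from a script, here is a minimal Python sketch. It assumes the text has the same fields as the output quoted in the post; the function name `parse_scrub_status` is made up for this illustration and is not part of btrfs or Unraid.

```python
import re

def parse_scrub_status(text):
    """Pull the error counters out of scrub status text.

    Hypothetical helper: it only understands the fields that appear
    in the output quoted in the post above.
    """
    counters = {}
    # Everything between "Error summary:" and "Corrected:" holds name=value pairs.
    summary = re.search(r"Error summary:(.*?)(?:Corrected:|$)", text, re.S)
    if summary:
        for name, value in re.findall(r"(\w+)=(\d+)", summary.group(1)):
            counters[name] = int(value)
    # The remaining totals are "Label: number" pairs.
    for name, value in re.findall(r"(Corrected|Uncorrectable|Unverified):\s*(\d+)", text):
        counters[name.lower()] = int(value)
    return counters

# Text copied from the post above.
SCRUB_TEXT = """\
Error summary: read=46387735 super=2
Corrected: 0
Uncorrectable: 46387735
Unverified: 0
"""

result = parse_scrub_status(SCRUB_TEXT)
# Flag the run if anything could not be repaired.
needs_attention = result.get("uncorrectable", 0) > 0
```

With the numbers from the post, every read error is also counted as uncorrectable and nothing was corrected, so `needs_attention` ends up true.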
August 13, 2023
The diags don't show the start of the problem because the syslog rotated, but one of the cache devices dropped offline. It could be a cable problem, but since it's an MX500, update the firmware, then run another correcting scrub.
August 14, 2023 Author
@JorgeB Thanks for the response! I rebooted the server (which unfortunately started a parity check). It hung forever trying to unmount the MX500 and then must have shut down uncleanly(?), so that's unfortunate. However, when it came back up it couldn't even see that drive anymore. I'll try replacing the SATA cable and see, but given the age of that drive I think it might have given up the ghost.
Regarding the firmware in that thread, do we consider the downloadable one safe? Or am I better off putting that drive (if alive) and the other MX500 I have in service into a Windows box and updating that way? I actually have a couple of 500GB MX500s that I could put in to replace the pool should that drive indeed be gone, but it seems I should update the firmware first either way. Thank you.
August 14, 2023
8 hours ago, spall said: "Regarding the firmware in that thread, do we consider the downloadable one safe? Or am I better putting that drive (if alive) and other MX500 I have in service into a Windows box and updating that way?"
Either way should be OK. A power cycle (vs. a reboot) may bring the device back online.
August 14, 2023 Author
@JorgeB Thanks again for the response. I think what I'm going to do is replace both drives in that pool to get it squared away, and work on the situation with the MX500 (and check all my other MX500s) in my test system.
Let me ask: Is the correct procedure to replace the problematic SSD with a new one, let the pool rebuild, then replace the other SSD, and let it rebuild again? Is it fine that I'll be using higher-capacity drives, and will the pool just grow to the new size after the second rebuild?
EDIT: Or would it make more sense to make the new pool ZFS? Thanks!
Edited August 14, 2023 by spall
August 15, 2023
12 hours ago, spall said: "Let me ask: Is the correct procedure to replace the problematic SSD with a new one, let the pool rebuild, then replace the other SSD, and let it rebuild again? Is it fine that I'll be using higher capacity and will it just grow the pool after the second rebuild to the new size?"
Yes to both.
12 hours ago, spall said: "Or would it make more sense to make the new pool ZFS?"
It's up to you. ZFS is better when a device drops: it automatically syncs the device back up once it's reconnected, whereas with btrfs you need to run a scrub. In either case, it's good to use this script for better pool monitoring.
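The script linked in the reply above is not reproduced in this thread. As a rough illustration of what pool monitoring for btrfs can look like, here is a minimal Python sketch that reads the per-device error counters printed by `btrfs device stats` and reports the devices with nonzero values. This is not the linked script; the function names and the sample values are invented for the example.

```python
import re

# Matches lines such as "[/dev/sdb1].read_io_errs    0"
STAT_LINE = re.compile(r"^\[(?P<dev>[^\]]+)\]\.(?P<counter>\w+)\s+(?P<value>\d+)$")

def parse_device_stats(text):
    """Return {device: {counter: value}} from `btrfs device stats` text."""
    stats = {}
    for line in text.splitlines():
        match = STAT_LINE.match(line.strip())
        if match:
            stats.setdefault(match["dev"], {})[match["counter"]] = int(match["value"])
    return stats

def devices_with_errors(stats):
    """Keep only devices that have at least one nonzero counter."""
    return {
        dev: {name: value for name, value in counters.items() if value}
        for dev, counters in stats.items()
        if any(counters.values())
    }

# Made-up sample in the format `btrfs device stats` prints (values are hypothetical).
SAMPLE = """\
[/dev/sdb1].write_io_errs    0
[/dev/sdb1].read_io_errs     0
[/dev/sdb1].flush_io_errs    0
[/dev/sdb1].corruption_errs  0
[/dev/sdb1].generation_errs  0
[/dev/sdc1].write_io_errs    12
[/dev/sdc1].read_io_errs     7
[/dev/sdc1].flush_io_errs    0
[/dev/sdc1].corruption_errs  0
[/dev/sdc1].generation_errs  0
"""

problems = devices_with_errors(parse_device_stats(SAMPLE))
for dev, counters in problems.items():
    print(f"{dev}: {counters}")
```

On a live system the text would come from running `btrfs device stats` against the pool's mount point (for example via `subprocess.run`), and the check only helps if it actually runs on a schedule and sends a notification when `problems` is not empty.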
August 15, 2023 Author
10 hours ago, JorgeB said: "Yes to both. It's up to you, zfs is better when a device drops, it automatically syncs it back up once reconnected, with btrfs you need to run a scrub, in either case good to use this script for better pool monitoring."
Funny story: I have a script. It apparently helps if you set a schedule for it to actually run. The curse of setting things up at 4am and thinking you've done it correctly.
Maybe I'll just set a new pool up as ZFS and diagnose the other drives in the pool at my leisure. Thanks again.