NIronwolf Posted January 11, 2023

First, I realize I have a flaky port replicator or something; it can sometimes drop a drive. Today a drive dropped from the array. I took a diagnostic, rebooted, then took a second diagnostic. I toggled the drive to empty and then back to the original drive. I've done this in the past and it just rebuilt and everything was fine. However, today ALL the disks say "Unmountable: Volume not encrypted". I'm worried it already killed everything, as it auto-started the rebuild, but I hit Stop right away. Now it's trying to stop the array but not finishing. It reports drive activity is 0, but the status at the bottom says "Array Stopping • Retry unmounting disk share(s)..." Have I just corrupted everything?

reality-diagnostics-20230111-0735.zip
reality-diagnostics-20230111-0841.zip
JorgeB Posted January 11, 2023

Btrfs is detecting data corruption on all pool members. That, plus the large number of unrelated crashes, makes me think there might be a RAM problem; start by running memtest. Do you have a backup of the LUKS headers? It might be needed if they were corrupted.
NIronwolf (Author) Posted January 11, 2023

I don't know how to make a backup of the LUKS headers. (I recognize the word, though, and that worries me greatly.) The btrfs is only on one of the cache pools; I use ext4 on all the array drives (encrypted).
NIronwolf (Author) Posted January 11, 2023

Memory checked fine (as I suspected it would). The errors are from the flaky connection to my external drive enclosure over eSATA, I believe.
JorgeB (Solution) Posted January 11, 2023

That should not cause btrfs corruption errors. A single pass of memtest is no proof that everything is OK, just that if it is a RAM problem, it's not a major one. Run a correcting scrub; then, if no errors are found or all are corrected, reset the stats and keep monitoring for new errors. For the other problem, it looks more like an issue caused by all the crashes; reboot and post new diags after array start.
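For reference, the correcting-scrub-then-reset-stats sequence suggested above can be run from the Unraid command line roughly as follows. The `/mnt/cache` path is an example; substitute the mount point of the affected pool:

```shell
# Start a scrub; on a read-write mount this is correcting by default,
# repairing any corrupt blocks that have a good copy on another pool member.
btrfs scrub start /mnt/cache

# Check progress, and the error summary once it finishes.
btrfs scrub status /mnt/cache

# After reviewing the results, zero the per-device error counters
# so that any NEW errors are easy to spot later.
btrfs dev stats -z /mnt/cache
```

Unraid's GUI also exposes the scrub on each pool's device page, but the `dev stats -z` reset is only available from the shell.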
NIronwolf (Author) Posted January 11, 2023

Oh, and the 4 drives that are set up as btrfs aren't connected that way; I'd guess any corruption there would have been from the system locking up and having to hard power it off. OK, this time all the disks mounted normally and show they're xfs (with the disk that dropped being rebuilt). Running that scrub on my scratch cache drives now. I guess it just needed me to tattle on it to do what it normally does when I have to reset a drive.
NIronwolf (Author) Posted January 11, 2023

Oh, here's a diagnostic grabbed while it's starting all that work. If all goes normally, I can say it's rebuilt in about 36 hours. You probably don't need to look at this diag unless something more goes haywire. Thank you for your assistance, JorgeB!

reality-diagnostics-20230111-1150.zip
JorgeB Posted January 11, 2023

Quoting NIronwolf: "I'd guess any corruption there would have been from the system locking up and having to hard power it off."

Nope, that also should not cause that. Diags look normal for now; wait for the scrub results.
JorgeB Posted January 11, 2023

And make sure you make a backup of the LUKS headers, e.g.:
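A LUKS header backup can be made with `cryptsetup`, run once per encrypted device. The device name and output path below are examples, not taken from this system:

```shell
# Save a copy of the LUKS header; if the on-disk header is corrupted,
# the encrypted data is unrecoverable even with the correct passphrase,
# so this backup is the only way back.
cryptsetup luksHeaderBackup /dev/sdb1 \
    --header-backup-file /boot/config/luks-header-sdb1.img

# If the on-disk header is ever damaged, restore it with:
# cryptsetup luksHeaderRestore /dev/sdb1 \
#     --header-backup-file /boot/config/luks-header-sdb1.img
```

Store the backup somewhere off the encrypted array itself, and treat it as sensitive: anyone holding the header backup plus the passphrase can unlock the data.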
NIronwolf (Author) Posted January 12, 2023

Rebuild and scrub completed. All looks good. Thanks again for your guidance, JorgeB!