Mat1926 Posted May 2, 2018
@johnnie.black @jonathanm @trurl unraid-diagnostics-20180503-0119.zip
*edit* The error column on the Main page is showing 0 for all disks...
*edit2* Is it possible that this error is related to my previous post?
JorgeB Posted May 3, 2018
Run a correcting check. The sync error is on parity2 only, so it is not related to the previous issue.
Mat1926 Posted May 3, 2018
44 minutes ago, johnnie.black said: "Run a correcting check, sync error is on parity2 only so not related to previous issue."
I just started a correcting check, can you guess what caused this?!
*edit* Why didn't I get a notification about this? Or perhaps these errors can't be detected in real-time? Thnx
JorgeB Posted May 3, 2018
59 minutes ago, Mat1926 said: "can you guess what caused this?!"
Any unclean shutdown since last check? That would be the most likely cause.
59 minutes ago, Mat1926 said: "Why didn't I get a notification about this?"
You should get a notification at the end of the check showing the number of errors.
Mat1926 Posted May 3, 2018
7 minutes ago, johnnie.black said: "Any unclean shutdown since last check? That would be the most likely cause." "You should get a notification at the end of the check showing the number of errors."
Before purchasing my UPS, I had one unclean shutdown, but the automatic parity check afterward did not report any errors at all...
When I asked about the notification, I meant while using the system and before doing any parity checks: isn't the system capable of detecting such errors and reporting them? Or can't these errors be detected w/o a full parity check? Thnx
JorgeB Posted May 3, 2018
5 minutes ago, Mat1926 said: "Or these errors can't be detected w/o a full parity check?"
This.
Mat1926 Posted May 3, 2018
3 minutes ago, johnnie.black said: "This"
So, this does not mean that the files on the data disks are actually corrupt...correct?
JorgeB Posted May 3, 2018
Data should be fine since it agrees with parity1.
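A minimal sketch of why a parity2-only error leaves the data trustworthy. It assumes a textbook RAID-6 style layout where P (parity1) is the XOR of the data bytes and Q (parity2) is a weighted sum over GF(2^8); the thread does not describe how Unraid computes parity2, so treat the Q formula and the names `compute_p`, `compute_q`, and `classify` as illustrative assumptions.

```python
from functools import reduce


def gf_mul2(x):
    """Multiply a byte by 2 in GF(2^8), reduction polynomial 0x11d (assumed)."""
    x <<= 1
    if x & 0x100:
        x ^= 0x11D
    return x & 0xFF


def compute_p(data):
    """P parity: plain XOR of the data bytes at one position."""
    return reduce(lambda a, b: a ^ b, data, 0)


def compute_q(data):
    """Q parity: d0 + 2*d1 + 4*d2 + ... in GF(2^8), evaluated with Horner's rule."""
    q = 0
    for byte in reversed(data):
        q = gf_mul2(q) ^ byte
    return q


def classify(data, stored_p, stored_q):
    """Report which stored parity values disagree with the data."""
    wrong = []
    if compute_p(data) != stored_p:
        wrong.append("P")
    if compute_q(data) != stored_q:
        wrong.append("Q")
    return " ".join(wrong) or "ok"


data = [0x11, 0x22, 0x33]          # one byte from each of three data disks
p, q = compute_p(data), compute_q(data)
print(classify(data, p, q ^ 0x01))  # only the stored Q is damaged
```

In this model a damaged Q block fails only the Q comparison, while a changed data byte would disturb both P and Q. A Q-only mismatch therefore points at parity2 itself rather than at the data disks.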
Mat1926 Posted May 3, 2018
7 minutes ago, johnnie.black said: "Data should be fine since it agrees with parity1."
Can you plz tell me where exactly I can see the issue with parity 2, and in which log file? Thnx
JorgeB Posted May 3, 2018
May 2 21:54:02 unRaid kernel: md: recovery thread: Q incorrect, sector=16883534704
Q is parity2, P would be parity1. P Q would mean both were incorrect.
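The reading of that log line can be sketched as a small parser. The pattern below is built from the single line quoted above; the exact wording of a "P incorrect" or "P Q incorrect" line is an assumption based on the description, and `parse_sync_errors` is a hypothetical helper name.

```python
import re

# Matches lines such as:
#   "... kernel: md: recovery thread: Q incorrect, sector=16883534704"
# The "P" and "P Q" variants are assumed to follow the same shape.
SYNC_ERROR = re.compile(
    r"recovery thread: (?P<which>P Q|P|Q) incorrect, sector=(?P<sector>\d+)"
)

PARITY_NAMES = {
    "P": "parity1",
    "Q": "parity2",
    "P Q": "parity1 and parity2",
}


def parse_sync_errors(lines):
    """Return (affected parity, sector) for every sync error line found."""
    found = []
    for line in lines:
        match = SYNC_ERROR.search(line)
        if match:
            found.append((PARITY_NAMES[match.group("which")], int(match.group("sector"))))
    return found


log = [
    "May 2 21:54:02 unRaid kernel: md: recovery thread: Q incorrect, sector=16883534704",
]
print(parse_sync_errors(log))
```

Feeding it the syslog from a diagnostics zip would list which parity each reported sector belongs to, which makes it easy to see whether later errors are always on parity2.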
Mat1926 Posted May 3, 2018
@johnnie.black When the process finishes (~22 hours), shall I run it again w/o corrections? What about using the system now, or shall I leave it alone until the check ends? Thnx
JorgeB Posted May 3, 2018
You can, but it's probably best to confirm this was a one-time thing. You can use the array, but avoid large IO operations if possible, as the array will be slower and they will also slow down the check.
Mat1926 Posted May 3, 2018
@johnnie.black @jonathanm @trurl I was thinking about the fact that I had an error that I did not discover until I ran a parity check. What if a drive had failed before this error was corrected... What could I do in that case? The parity check with corrections is still running, and it still needs at least 8 more hours...
Mat1926 Posted May 4, 2018
18 hours ago, johnnie.black said: "You can, but it's probably best to confirm this was a one-time thing. You can use the array but avoid large IO operations if possible, as the array will be slower and also slow down the check."
The correcting parity check just finished w/o detecting any errors or making any changes! How can this be?! What is the story here? Would you plz look into this? unraid-diagnostics-20180504-0709.zip
JorgeB Posted May 4, 2018
That's unusual, especially since according to your sig you're using ECC. You'll need to monitor the next checks, i.e., if there are more errors, see whether they are always on parity2.
Mat1926 Posted May 4, 2018
1 minute ago, johnnie.black said: "That's unusual, especially since according to your sig you're using ECC, you'll need to monitor the next ones, i.e., if there are more errors are they always on parity2."
What shall I do now?! More tests or what exactly?!
Mat1926 Posted May 4, 2018
3 minutes ago, johnnie.black said: "That's unusual, especially since according to your sig you're using ECC, you'll need to monitor the next ones, i.e., if there are more errors are they always on parity2."
This is my RAM, would you double check plz?!
JorgeB Posted May 4, 2018
Continue using the array normally and run a couple more non-correcting checks. There's also the possibility that something was written to the previously wrong sector after the first check, updating parity, so now it would be correct. If that wasn't the case, you likely have a hardware problem somewhere.
Mat1926 Posted May 4, 2018
5 minutes ago, johnnie.black said: "Continue using the array normally and run a couple more non correct checks. There's also the possibility that something was written to the previous wrong sector after the first check updating parity, so now it would be correct, if it wasn't that you likely have a hardware problem somewhere."
Nothing was written, since the array is mounted read-only! Did you look at the RAM part number?!
*edit* Is it safe to assume that the data integrity is valid?!
John_M Posted May 4, 2018
2 minutes ago, Mat1926 said: "Did you look at the RAM part number?!"
http://www.samsung.com/semiconductor/dram/module/M391A1G43DB0-CPB/
Mat1926 Posted May 4, 2018
Just now, John_M said: "http://www.samsung.com/semiconductor/dram/module/M391A1G43DB0-CPB/"
Is it safe to assume that the data integrity is okay and that I would be able to rebuild if needed?!
John_M Posted May 4, 2018
Not if you've got a hardware fault.
Mat1926 Posted May 4, 2018
1 minute ago, John_M said: "Not if you've got a hardware fault."
Since Parity is okay now, doesn't this confirm that the data is okay?
JorgeB Posted May 4, 2018
Data should be OK, but the only way to be sure would be if you have checksums (or were using btrfs).
Mat1926 Posted May 4, 2018
Just now, johnnie.black said: "Data should be OK, but the only way to be sure would be if you have checksums (or were using btrfs)."
Is there a method to do that?!
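One generic way to get the checksums mentioned above is to record a hash for every file while the data is known to be good and compare against that record later. The sketch below uses only the Python standard library; it is not a tool named in this thread, and the function names and the example path are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large media files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(root, manifest):
    """List files from the manifest that are now missing or hash differently."""
    current = build_manifest(root)
    return sorted(name for name in manifest if current.get(name) != manifest[name])


# Example use (hypothetical path):
#   manifest = build_manifest("/mnt/disk1")           # while data is known good
#   ... later, after a suspicious parity check ...
#   print(changed_files("/mnt/disk1", manifest))
```

The manifest has to be created before any suspected corruption and stored somewhere other than the disk it describes. Files that were edited on purpose will also show up as changed, so the list needs a manual review.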
This topic is now archived and is closed to further replies.