eweitzman Posted September 17, 2014

Hi. I replaced a 3TB parity drive with a new 6TB WD Red. After everything was done, I ran a parity check. Sometime between starting it in the morning and coming home from work in the evening, a few drives went offline and the parity check logged about 2 million sync errors per drive.

The process I followed was:
- ran a parity check with the old 3TB parity drive
- pre-cleared the new 6TB drive
- replaced the 3TB parity drive with the 6TB drive
- rebuilt parity on the 6TB drive
- ran a parity check with the 6TB drive. This is where the drives went offline.

I checked all connections and restarted the system. All drives and controllers seem okay, and the array comes online with all green balls. Then I started a read-only parity check. About 20 minutes in, sync errors started to show up.

I see two options after I check all the hardware.

Option 1. Assume the parity is bad on the 6TB drive. Run a parity check to update the 6TB drive again.

Option 2. Put the original 3TB parity drive back in. Run a read-only parity check and, if all is well, pre-clear the 6TB drive and start over. If there are problems with the 3TB read-only parity check, some of the data drives have been corrupted, and I'll need to find a new course of action.

Option 1 will be much faster. It took 60 hours to pre-clear the 6TB drive at speeds of 100-140MB/s. But it doesn't tell me the state of the other drives in the array.

Thanks for any advice,
- Eric
dgaschk Posted September 17, 2014

See here: http://lime-technology.com/forum/index.php?topic=9880.0
bombz Posted September 17, 2014

Have other users had any issues using 6TB drives in their servers? I assume UnRAID supports that disk size. This is the first post I have seen reporting issues with a 6TB disk.
dgaschk Posted September 18, 2014

Version 5 should support drives of up to 16TB.
eweitzman Posted September 18, 2014

A syslog of the good, running system is attached. No logs were saved to /boot/logs during the upgrade process, even though clean shutdowns were done every time.

unRAID version is 5.0.5. 17 SATA drive array. Drives connected to MSI-7512 motherboard, SAS2LP-MV8, and a 2-port SATA card. 2GB RAM. PATA cache drive.

- Eric

Attachment: syslog.txt
megalodon Posted September 18, 2014

I had no issues upgrading my parity to 6TB. All checks came back fine. If the preclear results for your parity drive came back good, it looks like you may have a corrupted data drive. I would look at Option 2 just to be safe, especially if some of your data drives are old.
Squid Posted September 18, 2014

What controller card are you using? I had a similar issue with my Supermicro AOC-SAS2LP-MV8. There are scattered reports on the web that errors can happen when you're really taxing these cards. In my case, I found that if I set my tunables via unraid-tunables-tester.sh to the max, I would get errors during parity checks. If I changed them to the "best bang for the buck" values, the errors all disappeared.
eweitzman Posted September 19, 2014

It doesn't seem like there are any real options. If I put the old 3TB parity drive back in, the array will think it's a new drive and will want to initialize it, since the array was last running with the 6TB drive for parity. So if that's true, and the 3TB drive with valid parity is now useless for checking the array, I'll have to rebuild parity on the 6TB drive and hope there's no corruption on any other drive.
dgaschk Posted September 19, 2014

> It doesn't seem like there are any real options. If I put the old 3TB parity drive back in, the array will think it's a new drive and will want to initialize it, since the array was last running with the 6TB drive for parity. So if that's true, and the 3TB drive with valid parity is now useless for checking the array, I'll have to rebuild parity on the 6TB drive and hope there's no corruption on any other drive.

Reset the config using the New Config Utility. Select the desired drives, including the 3TB parity drive, and check the box that indicates parity is good.
eweitzman Posted September 20, 2014

That's fantastic. Thanks! I didn't know you could keep parity with new config/initconfig. I'll get the array back up this way and then do a parity check to see if anything has gone wrong on the data drives.

Is there a way to identify which files may be affected if there are sync errors found? Since the data and parity are striped across all drives, I wouldn't think the disk with a bad file could be identified.
itimpi Posted September 20, 2014

> Is there a way to identify which files may be affected if there are sync errors found? Since the data and parity are striped across all drives, I wouldn't think the disk with a bad file could be identified.

I do not believe that you can. You only know that there is a problem with a particular sector on one of the drives. Even if you knew which drive, there is no easy (i.e. realistic) way to map a sector back to the file that contains it.
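Because a sync error points at a sector position rather than a file, one practical way to look for corrupted files is to compare them against checksums recorded while the data was known to be good. The following is a minimal sketch of that idea, assuming such a manifest was saved earlier. The function names and the manifest layout are hypothetical and are not part of unRAID.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large media files do not fill RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root, manifest):
    """List files whose current hash differs from the manifest, or that are missing."""
    current = build_manifest(root)
    return sorted(
        name for name, digest in manifest.items() if current.get(name) != digest
    )
```

A manifest would be built once per data disk (for example with `build_manifest("/mnt/disk1")`) while the array is healthy, stored somewhere off the array, and checked later with `changed_files`. This only detects changes; it cannot tell legitimate edits from corruption.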
garycase Posted September 20, 2014

> Since the data and parity are striped across all drives, I wouldn't think the disk with a bad file could be identified.

The data is NOT striped across all the drives -- it's only on the drive where you wrote the particular file. Parity is simply computed across all drives, so when you encounter a sync error, that simply means the current parity bit doesn't match what it should be. UnRAID always assumes the error is in the parity bit itself, since that's by far the most likely.

It would, with the right utility, be possible to identify the SET of files that might be involved ... by identifying the file on every data disk that includes the bit where the error was -- but to my knowledge there are no utilities available to do that. It would indeed be handy, however, as you could then just check those specific files (either by verifying checksums or by comparing them to your backups) instead of having to do that check for the entire array to confirm a sync error didn't result from file corruption.

Since all the drives show good status, I'd simply recompute parity (or just run a correcting check, which will effectively do the same thing) ... then run another parity check, which should be error-free.
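To illustrate why a sync error says where the mismatch is but not which drive caused it, here is a toy sketch of single parity computed as the XOR of the same byte position on every data drive. It is only a model of the idea described above, not unRAID's actual implementation; the function names and the tiny 4-byte "drives" are made up for the example.

```python
from functools import reduce


def compute_parity(data_blocks):
    """XOR the bytes at each offset across all data drives into one parity block."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*data_blocks))


def sync_error_offsets(data_blocks, parity_block):
    """Return the byte offsets where stored parity differs from recomputed parity."""
    expected = compute_parity(data_blocks)
    return [i for i, (p, e) in enumerate(zip(parity_block, expected)) if p != e]


# Three toy "data drives", each holding one 4-byte block at the same offset.
disk1 = bytes([0x0F, 0x00, 0xFF, 0x10])
disk2 = bytes([0xF0, 0x01, 0xFF, 0x20])
disk3 = bytes([0x00, 0x03, 0x00, 0x30])
parity = compute_parity([disk1, disk2, disk3])

# Case A: one bit flips on a data drive and parity is not updated.
bad_disk2 = bytes([disk2[0] ^ 0x01]) + disk2[1:]
errors_data = sync_error_offsets([disk1, bad_disk2, disk3], parity)

# Case B: the same bit flips on the parity drive instead.
bad_parity = bytes([parity[0] ^ 0x01]) + parity[1:]
errors_parity = sync_error_offsets([disk1, disk2, disk3], bad_parity)

# Both cases report the same offset, so the check alone cannot name the culprit.
print(errors_data, errors_parity)
```

Both cases flag the same offset, which is why the candidate set is "one file per data disk at that position" and why checksums or backups are needed to narrow it down.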
eweitzman Posted September 21, 2014

The new config operation and read-only parity check with the old 3TB parity drive are complete. The parity check immediately showed 1867 sync errors (Main | Array Operations page), then completed half a day later with no more sync errors. There were zero disk errors (Main | Array Devices page). The syslog showed about 40 parity errors, one in sector 128 and all the rest in consecutive sectors starting at sector 12584.

All the top-level directories look like they should, and the drives look as full as they should. So it looks like my data is intact, with possibly a bit of loss or a bit of bad parity. Next is to run SMART tests and maybe reiserfsck. Then I'll start over, pre-clear, and install the new 6TB parity drive.

Thanks for all the hand holding.
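For a rough sense of where the logged sectors sit on the disk, the sector numbers can be converted to byte offsets. This is a small sketch that assumes 512-byte logical sectors and a 4096-byte filesystem block; whether the syslog counts sectors from the start of the device or from the start of the partition is not stated in the thread, so treat the results as approximate.

```python
SECTOR_SIZE = 512   # bytes per logical sector (assumed)
BLOCK_SIZE = 4096   # bytes per filesystem block (assumed)


def sector_to_byte_offset(sector, sector_size=SECTOR_SIZE):
    """Byte offset at which the given sector starts."""
    return sector * sector_size


def sector_to_block(sector, block_size=BLOCK_SIZE, sector_size=SECTOR_SIZE):
    """Index of the filesystem-sized block that contains the given sector."""
    return sector_to_byte_offset(sector, sector_size) // block_size


# The two sector numbers mentioned in the post above.
for sector in (128, 12584):
    offset = sector_to_byte_offset(sector)
    print(f"sector {sector}: byte offset {offset} "
          f"({offset / 2**20:.2f} MiB), block {sector_to_block(sector)}")
```

Under these assumptions both positions fall within the first few MiB of the disks, far from the bulk of the stored data.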