January 19, 2013
I decided to run the preclear script on the WD Green drive that had, up until a few days ago, been my parity drive. I ran it on my testbed and it "finished" in 13 hours, which was a little sooner than expected. Then I noticed "SORRY: Disk /dev/sdb MBR could NOT be precleared". Odd?

================================================================== 1.13
= unRAID server Pre-Clear disk /dev/sdb
= cycle 1 of 1, partition start on sector 64
= Disk Pre-Clear-Read completed DONE
= Step 1 of 10 - Copying zeros to first 2048k bytes DONE
= Step 2 of 10 - Copying zeros to remainder of disk to clear it DONE
= Step 3 of 10 - Disk is now cleared from MBR onward. DONE
= Step 4 of 10 - Clearing MBR bytes for partition 2,3 & 4 DONE
= Step 5 of 10 - Clearing MBR code area DONE
= Step 6 of 10 - Setting MBR signature bytes DONE
= Step 7 of 10 - Setting partition 1 to precleared state DONE
= Step 8 of 10 - Notifying kernel we changed the partitioning DONE
= Step 9 of 10 - Creating the /dev/disk/by* entries DONE
= Step 10 of 10 - Verifying if the MBR is cleared. DONE
= Elapsed Time: 13:21:15
========================================================================1.13
==
== SORRY: Disk /dev/sdb MBR could NOT be precleared
==
== out4= 00000
== out5= 00000
============================================================================
0+1 records in
0+1 records out
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
0000700 0000 0000 0000 0040 0000 8870 e8e0
0000716
462 bytes (462 B) copied, 5.6228e-05 s, 8.2 MB/s
root@Tower:/boot#
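The short dump at the end of that output can be read as part of a standard MBR partition entry. The sketch below is only an illustration. It assumes the dump shows hexadecimal 16-bit words from a little-endian machine (as `od -x` prints them on x86), and it relies on the usual MBR layout, in which entry 1 occupies bytes 446 to 461, with the start LBA at byte 454 and the sector count at byte 458. The helper name `words_to_bytes` is made up for this example, and nothing here claims to reproduce what the preclear script itself checks.

```python
import struct

# Words from the dump line at octal offset 0000700 (decimal 448), as pasted.
OD_WORDS = "0000 0000 0000 0040 0000 8870 e8e0".split()

def words_to_bytes(words):
    """Turn hex 16-bit words back into the byte order found on disk.

    Assumption: the words were printed by a little-endian host, so each
    displayed word has its two bytes swapped relative to the disk.
    """
    return b"".join(struct.pack("<H", int(word, 16)) for word in words)

tail = words_to_bytes(OD_WORDS)  # bytes 448..461 of the first sector

# Start LBA lives at byte 454 and sector count at byte 458, both
# 32-bit little-endian, so they begin 6 bytes into `tail`.
start_lba, sector_count = struct.unpack_from("<II", tail, 454 - 448)

if __name__ == "__main__":
    print("partition 1 start sector:", start_lba)
    print("partition 1 sector count:", sector_count)
    print("partition 1 size in bytes:", sector_count * 512)
```

Decoded this way, the start sector agrees with the "partition start on sector 64" line in the preclear banner. The dump stops at byte 462, so the signature bytes at offsets 510 and 511 are not part of it.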
January 19, 2013
Basically, it did not find what was expected when it read the pre-cleared disk back. It could be almost anything, from an actual failed disk to a loose cable.
Joe L.
January 19, 2013 Author
Broad range. Searching around showed few people getting this error. The most recent case was in fact a failing drive. I have downloaded the WD Data Lifeguard tools and will see what they have to say about the drive.

Edit: From the log, it looks like there are a lot of I/O errors. Perhaps the drive is on its way out.

Jan 3 16:31:05 Tower kernel: Buffer I/O error on device sdb, logical block 488327474
Jan 3 16:31:05 Tower kernel: lost page write due to I/O error on sdb
Jan 3 16:31:05 Tower kernel: Buffer I/O error on device sdb, logical block 488327475
Jan 3 16:31:05 Tower kernel: lost page write due to I/O error on sdb
Jan 3 16:31:05 Tower kernel: Buffer I/O error on device sdb, logical block 488327476
Jan 3 16:31:05 Tower kernel: lost page write due to I/O error on sdb
Jan 3 16:31:05 Tower kernel: Buffer I/O error on device sdb, logical block 488327477
Jan 3 16:31:05 Tower kernel: lost page write due to I/O error on sdb
Jan 3 16:31:05 Tower kernel: Buffer I/O error on device sdb, logical block 488327478
Jan 3 16:31:05 Tower kernel: lost page write due to I/O error on sdb
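To get a feel for how widespread errors like these are across a whole syslog, the lines can be tallied per device. This is a minimal sketch that assumes the kernel message format shown in the excerpt above. The function name `summarize_io_errors` is hypothetical, and the sample holds only the lines pasted in this thread.

```python
import re

# Matches kernel lines such as:
#   "... kernel: Buffer I/O error on device sdb, logical block 488327474"
BUFFER_IO_RE = re.compile(
    r"kernel: Buffer I/O error on device (?P<dev>\w+), logical block (?P<block>\d+)"
)

def summarize_io_errors(lines):
    """Return {device: (error_count, lowest_block, highest_block)}."""
    blocks_by_device = {}
    for line in lines:
        match = BUFFER_IO_RE.search(line)
        if match:
            block = int(match.group("block"))
            blocks_by_device.setdefault(match.group("dev"), []).append(block)
    return {
        dev: (len(blocks), min(blocks), max(blocks))
        for dev, blocks in blocks_by_device.items()
    }

SAMPLE = """\
Jan 3 16:31:05 Tower kernel: Buffer I/O error on device sdb, logical block 488327474
Jan 3 16:31:05 Tower kernel: lost page write due to I/O error on sdb
Jan 3 16:31:05 Tower kernel: Buffer I/O error on device sdb, logical block 488327475
Jan 3 16:31:05 Tower kernel: lost page write due to I/O error on sdb
Jan 3 16:31:05 Tower kernel: Buffer I/O error on device sdb, logical block 488327476
Jan 3 16:31:05 Tower kernel: lost page write due to I/O error on sdb
Jan 3 16:31:05 Tower kernel: Buffer I/O error on device sdb, logical block 488327477
Jan 3 16:31:05 Tower kernel: lost page write due to I/O error on sdb
Jan 3 16:31:05 Tower kernel: Buffer I/O error on device sdb, logical block 488327478
Jan 3 16:31:05 Tower kernel: lost page write due to I/O error on sdb
""".splitlines()

if __name__ == "__main__":
    for dev, (count, low, high) in summarize_io_errors(SAMPLE).items():
        print(f"{dev}: {count} buffer I/O errors, blocks {low}..{high}")
```

A tight run of consecutive blocks on one device, as here, narrows things down to that disk or its connection, but it cannot by itself tell a failing drive from a loose cable.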
January 19, 2013
"Perhaps the drive is on its way out."
Or the cable came loose... Let us know what you find.
Joe L.
January 20, 2013 Author
I stuck the drive in my dock (so no cable issues) and ran the Data Lifeguard extended test, which also failed. Looks like the drive is pooched. Better to catch now though. Thanks.
January 20, 2013
"Looks like the drive is pooched. Better to catch now though."
I'm sorry, but happy you found it before trusting your data to it.
January 20, 2013 Author
So I guess my last question on this topic is: this was my former parity drive. What happens in this case? Parity checks while it was in use came back with no errors. What would that mean for the data it was supposedly parity-ing? (Now I'm just making up words.)
January 20, 2013 Author
Hrm, so I pulled the drive and, rather than do the initconfig, I used the "New Config" utility in unRAID 5. This did not work as it did on my testbed... I re-assigned all the drives to the same slots they were in, less the 2TB drive that failed me, and now it's insisting on doing a parity check, which makes sense as parity was discarded. My only question: it's showing tons of sync errors corrected. I'm hoping that's to be expected, since the one drive (despite being emptied) is no longer there?
January 21, 2013 Author
Last checked on Mon Jan 21 08:32:40 2013 EST, finding 464502929 errors. Syslog attached.
syslog-2013-01-21.txt
January 24, 2013 Author
Well, I added another 2TB drive back. This is not the one I RMA'd; it was my drive for storing system backups, which I replaced with another 3TB drive. I cleared it, tested it with WD Data Lifeguard (it passed), and inserted it as another data drive. Then I ran a parity check once more to make sure the last corrective parity run did as advertised.

The parity check finished with no errors this time. However, said "backup" 2TB drive, after being formatted and brought back into the array, is now showing 1568 errors! Can't catch a break, it seems. Latest syslog attached.

I just don't get how it passed the WD Data Lifeguard extended test (and I think I ran pre-clear on this drive too) and only now it's showing errors. There's no data on it either, which is odd. I can try re-seating the cable... but how would I best test the drive before trusting data to it?
syslog-2013-01-23.zip
January 25, 2013 Author
Attached. Says it passed the self test... I just find it odd that it had read errors when the drive is empty.
smart24.txt
January 25, 2013
"Attached. Says it passed the self test... I just find it odd that it had read errors when the drive is empty."
Actually, it says no self-test has ever been requested, and there are 4 sectors marked for possible re-allocation because the checksum at the end of each of those sectors did not match its contents.

197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 4

Even on a completely empty, unformatted disk, every sector in the partition is read when calculating parity. Most disks have several thousand spare sectors for reallocation. You need to "write" to those sectors to get them re-allocated. The easiest way is to have unRAID re-construct the disk onto itself. It will use parity in combination with all the other data disks to write the correct contents to the 4 unreadable sectors.
Joe L.
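The reconstruction described above can be illustrated with a toy model. This is only a sketch, assuming single parity computed as a bytewise XOR across the data disks. The four-byte "disks" and the names `xor_blocks` and `rebuild` are invented for the example and are not unRAID code.

```python
from functools import reduce
from operator import xor

def xor_blocks(blocks):
    """Bytewise XOR of equally sized byte strings."""
    return bytes(reduce(xor, column) for column in zip(*blocks))

# Three toy data disks; the last one is cleared (all zeros), yet it still
# takes part in the parity calculation like any other disk.
data_disks = [
    b"\x12\x34\x56\x78",
    b"\xab\xcd\xef\x01",
    b"\x00\x00\x00\x00",
]
parity = xor_blocks(data_disks)

def rebuild(missing_index, disks, parity_block):
    """Recompute one disk's contents from the other disks plus parity."""
    survivors = [d for i, d in enumerate(disks) if i != missing_index]
    return xor_blocks(survivors + [parity_block])

if __name__ == "__main__":
    for index, original in enumerate(data_disks):
        print(index, rebuild(index, data_disks, parity) == original)
```

Because XOR is its own inverse, the contents of any single disk equal the XOR of parity with every other disk. That is why a rebuild can write correct data over sectors the drive could no longer read, and why even an empty disk has every sector read during a parity check.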
January 25, 2013 Author
Sorry, I meant I noticed "SMART overall-health self-assessment test result: PASSED" when skimming. I've unassigned and reassigned the disk, and the rebuild should be done by morning. I'll have a look at the SMART values once it completes; hopefully it won't show any more sectors pending reallocation, so I can move on to (finally, it's been weeks now) playing around with unRAID 5!
January 26, 2013 Author
Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0

Looks good now. Hopefully it stays that way as I start to load it up! Thanks for the help.
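For comparing the attribute before and after a rebuild, a few lines of Python can pull the raw value out of a saved SMART report. This is a minimal sketch that assumes rows laid out like the ones pasted in this thread, with the raw value in the last column. The function name `raw_value` is hypothetical.

```python
def raw_value(report, attribute="Current_Pending_Sector"):
    """Return the last column (the raw value) of the named attribute row.

    Returns None when the attribute does not appear in the report.
    Assumes the raw value is a plain integer, which holds for the
    pending-sector rows shown in this thread.
    """
    for line in report.splitlines():
        fields = line.split()
        if attribute in fields:
            return int(fields[-1])
    return None

# The two rows quoted in this thread, exactly as pasted.
BEFORE = "197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 4"
AFTER = "Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0"

if __name__ == "__main__":
    print("pending before rebuild:", raw_value(BEFORE))
    print("pending after rebuild:", raw_value(AFTER))
```

Matching on the attribute name rather than the leading ID keeps the lookup working even when a row is pasted without its first column, as in the post above.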