Parity Updated Following Drive Recovery


In the past year or so, I have rarely turned on my unRaid server (4.7Pro) - mainly because I have been out of state for extended periods.

 

After a clean parity check, I added a disk and wrote several hundred GB of data to the array.

Before I left I ran a parity check (NOCORRECT) and it showed parity updated 4 times, which I understand means that the parity verification thread detected 4 parity mismatches but no actual updates occurred.  I looked at the SMART reports and only disk5 showed a problem: 16 pending sectors, with 0 reallocated events, and its short offline test completed without error.  I had no time to debug further.

 

Back in town, I reran the parity check (NOCORRECT) and it showed 1 sync error updated, and the syslog window showed handle_stripe read error; disk1 read error.  I cancelled the parity check.  I checked the SMART report for disk1 and it showed 9 pending sectors and 5 reallocated events, and the short SMART test showed a read failure.  Disk5 still showed 16 pending sectors, but its log and short test were clean.  Since I didn't have a replacement drive available, I couldn't attend to the problem and shut the array down.

 

I finally replaced disk1 and rebuilt it.  Upon completion, I get a message that the last parity check was <1 day ago and that parity was updated 1 time to address sync errors.  Rebuilding a disk only reads the parity and the other disks in order to write to the replacement, so the parity drive has still not been updated, right?  So where did this parity error come from?  Is it from a disk5 read error, and if so, chances are that the rebuilt drive has at least 1 bit in error, right?  The syslog doesn't seem to show any errors from the rebuild.
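
To spell out why I am worried, here is a minimal Python sketch of how I understand a single-parity (XOR) rebuild. The disk names and byte values are made up for illustration, and this is only my mental model, not the actual unRAID md driver code.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Toy stripe: three data disks plus parity (hypothetical values).
disk1 = bytes([0x12, 0x34])
disk2 = bytes([0xAB, 0xCD])
disk5 = bytes([0x0F, 0xF0])
parity = xor_blocks([disk1, disk2, disk5])

# Rebuilding disk1 reads parity and the surviving disks; parity is not written.
rebuilt_ok = xor_blocks([parity, disk2, disk5])

# Same rebuild, but disk5 silently returns one flipped bit in the first byte.
disk5_bad = bytes([disk5[0] ^ 0x01, disk5[1]])
rebuilt_bad = xor_blocks([parity, disk2, disk5_bad])

print(rebuilt_ok == disk1)   # clean reads reproduce the original disk1
print(rebuilt_bad == disk1)  # a bad read elsewhere lands in the rebuilt disk
```

If that model is right, any bad data returned by disk5 during the rebuild would end up in the same position on the new disk1, without parity itself changing.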

 

Jan  4 23:39:58 Tower emhttp: unRAID System Management Utility version 4.7
Jan  4 23:39:58 Tower emhttp: Copyright (C) 2005-2011, Lime Technology, LLC
Jan  4 23:39:58 Tower emhttp: Pro key detected, GUID: 05DC-A560-1010-153813190906
Jan  4 23:39:58 Tower emhttp: shcmd (1): udevadm settle
Jan  4 23:39:58 Tower emhttp: Device inventory:
Jan  4 23:39:58 Tower emhttp: pci-0000:00:1f.2-scsi-0:0:0:0 host3 (sdb) Hitachi_HDS723015BLA642_MN1B20F304G19D
Jan  4 23:39:58 Tower emhttp: pci-0000:00:1f.2-scsi-0:0:1:0 host3 (sdc) ST1500DL003-9VT16L_5YD8YMY3
Jan  4 23:39:58 Tower emhttp: pci-0000:00:1f.2-scsi-1:0:0:0 host4 (sdd) Hitachi_HDS723015BLA642_MN1B21F303G5BD
Jan  4 23:39:58 Tower emhttp: pci-0000:00:1f.2-scsi-1:0:1:0 host4 (sde) ST1500DL003-9VT16L_5YD8ZKC2
Jan  4 23:39:58 Tower emhttp: pci-0000:00:1f.5-scsi-0:0:0:0 host5 (sdf) Hitachi_HDS5C3015ALA632_ML0020F002NZ8D
Jan  4 23:39:58 Tower emhttp: pci-0000:00:1f.5-scsi-1:0:0:0 host6 (sdg) SAMSUNG_HD154UI_S1Y6J1KS802855
Jan  4 23:39:58 Tower emhttp: pci-0000:02:00.0-scsi-0:0:0:0 host0 (sda) Hitachi_HDS723015BLA642_MN1B21F301SEVA
Jan  4 23:39:58 Tower emhttp: shcmd (2): modprobe -rw md-mod 2>&1 | logger
Jan  4 23:39:58 Tower emhttp: shcmd (3): modprobe md-mod super=/boot/config/super.dat slots=8,16,8,48,8,32,8,64,8,80,8,96,8,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 2>&1 | logger
Jan  4 23:39:58 Tower kernel: xor: automatically using best checksumming function: pIII_sse
Jan  4 23:39:58 Tower unmenu-status: Starting unmenu web-server
Jan  4 23:39:58 Tower kernel:    pIII_sse  :  8869.600 MB/sec
Jan  4 23:39:58 Tower kernel: xor: using function: pIII_sse (8869.600 MB/sec)
Jan  4 23:39:58 Tower kernel: md: unRAID driver 1.1.1 installed
Jan  4 23:39:58 Tower kernel: md: import disk0: [8,16] (sdb) Hitachi HDS72301 MN1B20F304G19D size: 1465138552
Jan  4 23:39:58 Tower kernel: md: import disk1: [8,48] (sdd) Hitachi HDS72301 MN1B21F303G5BD size: 1465138552
Jan  4 23:39:58 Tower kernel: md: disk1 wrong
Jan  4 23:39:58 Tower kernel: md: import disk2: [8,32] (sdc) ST1500DL003-9VT1 5YD8YMY3 size: 1465138552
Jan  4 23:39:58 Tower kernel: md: import disk3: [8,64] (sde) ST1500DL003-9VT1 5YD8ZKC2 size: 1465138552
Jan  4 23:39:58 Tower kernel: md: import disk4: [8,80] (sdf) Hitachi HDS5C301 ML0020F002NZ8D size: 1465138552
Jan  4 23:39:58 Tower kernel: md: import disk5: [8,96] (sdg) SAMSUNG HD154UI  S1Y6J1KS802855       size: 1465138552
Jan  4 23:39:58 Tower kernel: md: import disk6: [8,0] (sda) Hitachi HDS72301 MN1B21F301SEVA size: 1465138552
Jan  4 23:39:58 Tower kernel: mdcmd (1): set md_num_stripes 1280
Jan  4 23:39:58 Tower kernel: mdcmd (2): set md_write_limit 768
Jan  4 23:39:58 Tower kernel: mdcmd (3): set md_sync_window 288
Jan  4 23:39:58 Tower kernel: mdcmd (4): set spinup_group 0 0
Jan  4 23:39:58 Tower kernel: mdcmd (5): set spinup_group 1 0
Jan  4 23:39:58 Tower kernel: mdcmd (6): set spinup_group 2 64
Jan  4 23:39:58 Tower kernel: mdcmd (7): set spinup_group 3 0
Jan  4 23:39:58 Tower kernel: mdcmd (8): set spinup_group 4 0
Jan  4 23:39:58 Tower kernel: mdcmd (9): set spinup_group 5 0
Jan  4 23:39:58 Tower kernel: mdcmd (10): set spinup_group 6 4
Jan  4 23:39:58 Tower emhttp: Spinning up all drives...
Jan  4 23:39:58 Tower kernel: mdcmd (11): spinup 0
Jan  4 23:39:58 Tower kernel: mdcmd (12): spinup 1
Jan  4 23:39:58 Tower kernel: mdcmd (13): spinup 2
Jan  4 23:39:58 Tower kernel: mdcmd (14): spinup 3
Jan  4 23:39:58 Tower kernel: mdcmd (15): spinup 4
Jan  4 23:39:58 Tower kernel: mdcmd (16): spinup 5
Jan  4 23:39:58 Tower kernel: mdcmd (17): spinup 6
Jan  4 23:39:59 Tower emhttp: stale configuration
Jan  4 23:39:59 Tower emhttp: shcmd (4): rm /etc/samba/smb-shares.conf >/dev/null 2>&1
Jan  4 23:39:59 Tower emhttp: _shcmd: shcmd (4): exit status: 1
Jan  4 23:39:59 Tower emhttp: shcmd (5): cp /etc/exports- /etc/exports
Jan  4 23:39:59 Tower emhttp: shcmd (6): killall -HUP smbd
Jan  4 23:39:59 Tower emhttp: shcmd (7): /etc/rc.d/rc.nfsd restart | logger
Jan  4 23:40:00 Tower emhttp: shcmd (7): cp /var/spool/cron/crontabs/root- /var/spool/cron/crontabs/root
Jan  4 23:40:00 Tower emhttp: shcmd (8): echo '# Generated mover schedule:' >>/var/spool/cron/crontabs/root
Jan  4 23:40:00 Tower emhttp: shcmd (9): echo '40 3 * * * /usr/local/sbin/mover 2>&1 | logger' >>/var/spool/cron/crontabs/root
Jan  4 23:40:00 Tower emhttp: shcmd (10): crontab /var/spool/cron/crontabs/root
Jan  4 23:40:05 Tower ntpd[1437]: synchronized to 204.9.54.119, stratum 1
Jan  4 23:40:04 Tower ntpd[1437]: time reset -0.863208 s
Jan  4 23:44:15 Tower emhttp: shcmd (12): /usr/local/sbin/set_ncq sdb 1 >/dev/null
Jan  4 23:44:15 Tower emhttp: shcmd (13): /usr/local/sbin/set_ncq sdd 1 >/dev/null
Jan  4 23:44:15 Tower emhttp: shcmd (14): /usr/local/sbin/set_ncq sdc 1 >/dev/null
Jan  4 23:44:15 Tower emhttp: shcmd (15): /usr/local/sbin/set_ncq sde 1 >/dev/null
Jan  4 23:44:15 Tower emhttp: shcmd (16): /usr/local/sbin/set_ncq sdf 1 >/dev/null
Jan  4 23:44:15 Tower emhttp: shcmd (17): /usr/local/sbin/set_ncq sdg 1 >/dev/null
Jan  4 23:44:15 Tower emhttp: shcmd (18): /usr/local/sbin/set_ncq sda 1 >/dev/null
Jan  4 23:44:15 Tower emhttp: writing mbr on disk 1 (/dev/sdd) with partition 1 offset 64
Jan  4 23:44:15 Tower emhttp: re-reading /dev/sdd partition table
Jan  4 23:44:15 Tower kernel:  sdd: sdd1
Jan  4 23:44:16 Tower kernel: mdcmd (18): start UPGRADE_DISK
Jan  4 23:44:16 Tower kernel: unraid: allocating 38840K for 1280 stripes (7 disks)
Jan  4 23:44:16 Tower kernel: md1: running, size: 1465138552 blocks
Jan  4 23:44:16 Tower kernel: md2: running, size: 1465138552 blocks
Jan  4 23:44:16 Tower kernel: md3: running, size: 1465138552 blocks
Jan  4 23:44:16 Tower kernel: md4: running, size: 1465138552 blocks
Jan  4 23:44:16 Tower kernel: md5: running, size: 1465138552 blocks
Jan  4 23:44:16 Tower kernel: md6: running, size: 1465138552 blocks
Jan  4 23:44:17 Tower emhttp: shcmd (19): udevadm settle
Jan  4 23:44:17 Tower emhttp: shcmd (20): mkdir /mnt/disk4
Jan  4 23:44:17 Tower emhttp: shcmd (20): mkdir /mnt/disk5
Jan  4 23:44:17 Tower emhttp: shcmd (20): mkdir /mnt/disk1
Jan  4 23:44:17 Tower emhttp: shcmd (20): mkdir /mnt/disk3
Jan  4 23:44:17 Tower emhttp: shcmd (20): mkdir /mnt/disk2
Jan  4 23:44:17 Tower emhttp: shcmd (20): mkdir /mnt/disk6
Jan  4 23:44:17 Tower kernel: mdcmd (19): check 
Jan  4 23:44:17 Tower kernel: md: recovery thread woken up ...
Jan  4 23:44:17 Tower kernel: md: recovery thread rebuilding disk1 ...
Jan  4 23:44:17 Tower emhttp: shcmd (21): set -o pipefail ; mount -t reiserfs -o noacl,nouser_xattr,noatime,nodiratime /dev/md4 /mnt/disk4 2>&1 | logger
Jan  4 23:44:17 Tower emhttp: shcmd (22): set -o pipefail ; mount -t reiserfs -o noacl,nouser_xattr,noatime,nodiratime /dev/md3 /mnt/disk3 2>&1 | logger
Jan  4 23:44:17 Tower emhttp: shcmd (23): set -o pipefail ; mount -t reiserfs -o noacl,nouser_xattr,noatime,nodiratime /dev/md6 /mnt/disk6 2>&1 | logger
Jan  4 23:44:17 Tower emhttp: shcmd (24): set -o pipefail ; mount -t reiserfs -o noacl,nouser_xattr,noatime,nodiratime /dev/md2 /mnt/disk2 2>&1 | logger
Jan  4 23:44:17 Tower emhttp: shcmd (25): set -o pipefail ; mount -t reiserfs -o noacl,nouser_xattr,noatime,nodiratime /dev/md5 /mnt/disk5 2>&1 | logger
Jan  4 23:44:17 Tower emhttp: shcmd (26): set -o pipefail ; mount -t reiserfs -o noacl,nouser_xattr,noatime,nodiratime /dev/md1 /mnt/disk1 2>&1 | logger
Jan  4 23:44:17 Tower kernel: md: using 1152k window, over a total of 1465138552 blocks.
Jan  4 23:44:17 Tower kernel: REISERFS (device md6): found reiserfs format "3.6" with standard journal
Jan  4 23:44:17 Tower kernel: REISERFS (device md6): using ordered data mode
Jan  4 23:44:17 Tower kernel: REISERFS (device md4): found reiserfs format "3.6" with standard journal
Jan  4 23:44:17 Tower kernel: REISERFS (device md4): using ordered data mode
Jan  4 23:44:17 Tower kernel: REISERFS (device md3): found reiserfs format "3.6" with standard journal
Jan  4 23:44:17 Tower kernel: REISERFS (device md3): using ordered data mode
Jan  4 23:44:17 Tower kernel: REISERFS (device md1): found reiserfs format "3.6" with standard journal
Jan  4 23:44:17 Tower kernel: REISERFS (device md1): using ordered data mode
Jan  4 23:44:17 Tower kernel: REISERFS (device md2): found reiserfs format "3.6" with standard journal
Jan  4 23:44:17 Tower kernel: REISERFS (device md2): using ordered data mode
Jan  4 23:44:17 Tower kernel: REISERFS (device md5): found reiserfs format "3.6" with standard journal
Jan  4 23:44:17 Tower kernel: REISERFS (device md5): using ordered data mode
Jan  4 23:44:17 Tower kernel: REISERFS (device md6): journal params: device md6, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
Jan  4 23:44:17 Tower kernel: REISERFS (device md6): checking transaction log (md6)
Jan  4 23:44:17 Tower kernel: REISERFS (device md4): journal params: device md4, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
Jan  4 23:44:17 Tower kernel: REISERFS (device md4): checking transaction log (md4)
Jan  4 23:44:17 Tower kernel: REISERFS (device md3): journal params: device md3, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
Jan  4 23:44:17 Tower kernel: REISERFS (device md3): checking transaction log (md3)
Jan  4 23:44:17 Tower kernel: REISERFS (device md2): journal params: device md2, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
Jan  4 23:44:17 Tower kernel: REISERFS (device md2): checking transaction log (md2)
Jan  4 23:44:17 Tower kernel: REISERFS (device md5): journal params: device md5, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
Jan  4 23:44:17 Tower kernel: REISERFS (device md5): checking transaction log (md5)
Jan  4 23:44:17 Tower kernel: REISERFS (device md1): journal params: device md1, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
Jan  4 23:44:17 Tower kernel: REISERFS (device md1): checking transaction log (md1)
Jan  4 23:44:17 Tower kernel: REISERFS (device md4): Using r5 hash to sort names
Jan  4 23:44:17 Tower kernel: REISERFS (device md6): Using r5 hash to sort names
Jan  4 23:44:17 Tower kernel: REISERFS (device md3): Using r5 hash to sort names
Jan  4 23:44:17 Tower kernel: REISERFS (device md5): Using r5 hash to sort names
Jan  4 23:44:17 Tower kernel: REISERFS (device md2): Using r5 hash to sort names
Jan  4 23:44:17 Tower kernel: REISERFS (device md1): Using r5 hash to sort names
Jan  4 23:44:18 Tower emhttp: shcmd (32): rm /etc/samba/smb-shares.conf >/dev/null 2>&1
Jan  4 23:44:18 Tower emhttp: shcmd (33): cp /etc/exports- /etc/exports
Jan  4 23:44:18 Tower emhttp: shcmd (34): mkdir /mnt/user
Jan  4 23:44:18 Tower emhttp: shcmd (35): /usr/local/sbin/shfs /mnt/user  -o noatime,big_writes,allow_other,default_permissions
Jan  4 23:44:30 Tower emhttp: get_config_idx: fopen /boot/config/shares/DVD.cfg: No such file or directory - assigning defaults
Jan  4 23:44:30 Tower emhttp: get_config_idx: fopen /boot/config/shares/FromTOS1000.cfg: No such file or directory - assigning defaults
Jan  4 23:44:30 Tower emhttp: get_config_idx: fopen /boot/config/shares/PBS.cfg: No such file or directory - assigning defaults
Jan  4 23:44:30 Tower emhttp: get_config_idx: fopen /boot/config/shares/Q9400-DDrive.cfg: No such file or directory - assigning defaults
Jan  4 23:44:30 Tower emhttp: get_config_idx: fopen /boot/config/shares/Sam154.cfg: No such file or directory - assigning defaults
Jan  4 23:44:30 Tower emhttp: get_config_idx: fopen /boot/config/shares/TV.cfg: No such file or directory - assigning defaults
Jan  4 23:44:30 Tower emhttp: get_config_idx: fopen /boot/config/shares/VRDsave-G5BD.cfg: No such file or directory - assigning defaults
Jan  4 23:44:30 Tower emhttp: shcmd (36): killall -HUP smbd
Jan  4 23:44:30 Tower emhttp: shcmd (37): /etc/rc.d/rc.nfsd restart | logger
Jan  4 23:48:43 Tower ntpd[1437]: synchronized to 204.9.54.119, stratum 1
Jan  5 08:24:30 Tower kernel: md: sync done. time=31212sec rate=46941K/sec
Jan  5 08:24:30 Tower kernel: md: recovery thread sync completion status: 0
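
As a sanity check that the rebuild covered the whole disk, the numbers in the last log lines can be cross-checked with a few lines of Python. This assumes the md "blocks" are 1 KiB each, which is my assumption and not something the log states.

```python
# Values copied from the syslog above.
total_blocks = 1465138552   # "over a total of 1465138552 blocks"
elapsed_sec = 31212         # "sync done. time=31212sec"

# Assuming 1 KiB blocks, blocks per second is the same as K/sec.
rate_k_per_sec = total_blocks // elapsed_sec
hours = elapsed_sec / 3600

print(rate_k_per_sec)       # compare with "rate=46941K/sec" in the log
print(round(hours, 2))      # rebuild duration in hours
```

The computed rate agrees with the reported rate, so the sync appears to have run across the full disk size with completion status 0.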

 

I started a new parity check (NOCORRECT) and right away it shows:  sync errors 3 (corrected).

 

Am I right to assume these 3 sync errors are due to disk5?

 

I guess my next step is to preclear the drive I removed, to try to get the pending sectors reallocated, and then use that drive to replace disk5 and rebuild.

 

Any guidance would be very much appreciated.

 

Thanks

Ed
