Error on Parity Check


I have what I think is a small problem: my automatic parity check came up with 1 error, but I'm not sure what the error is.  I have just restarted the server and am rerunning the parity check to see if the error comes back.  Here is part of my system log; if needed, I will post the whole thing when the check is done.

 

Also, I am running beta6a and the server has been up for about 6 months or so.

 

Aug 27 14:11:46 IronMan emhttp_event: array_started
Aug 27 14:11:46 IronMan emhttp: Mounting disks...
Aug 27 14:11:46 IronMan emhttp: shcmd (42): mkdir /mnt/disk3
Aug 27 14:11:46 IronMan emhttp: shcmd (43): mkdir /mnt/disk1
Aug 27 14:11:46 IronMan emhttp: shcmd (45): mkdir /mnt/disk4
Aug 27 14:11:46 IronMan emhttp: shcmd (44): mkdir /mnt/disk2
Aug 27 14:11:46 IronMan emhttp: shcmd (46): mkdir /mnt/disk5
Aug 27 14:11:46 IronMan emhttp: shcmd (47): mkdir /mnt/disk7
Aug 27 14:11:46 IronMan emhttp: shcmd (47): mkdir /mnt/disk6
Aug 27 14:11:46 IronMan emhttp: shcmd (49): set -o pipefail ; mount -t reiserfs -o noatime,nodiratime /dev/md1 /mnt/disk1 2>&1 |logger
Aug 27 14:11:46 IronMan emhttp: shcmd (48): set -o pipefail ; mount -t reiserfs -o noatime,nodiratime /dev/md3 /mnt/disk3 2>&1 |logger
Aug 27 14:11:46 IronMan emhttp: shcmd (50): set -o pipefail ; mount -t reiserfs -o noatime,nodiratime /dev/md7 /mnt/disk7 2>&1 |logger
Aug 27 14:11:46 IronMan emhttp: shcmd (51): set -o pipefail ; mount -t reiserfs -o noatime,nodiratime /dev/md2 /mnt/disk2 2>&1 |logger
Aug 27 14:11:46 IronMan emhttp: shcmd (52): set -o pipefail ; mount -t reiserfs -o noatime,nodiratime /dev/md5 /mnt/disk5 2>&1 |logger
Aug 27 14:11:46 IronMan emhttp: shcmd (53): set -o pipefail ; mount -t reiserfs -o noatime,nodiratime /dev/md6 /mnt/disk6 2>&1 |logger
Aug 27 14:11:46 IronMan emhttp: shcmd (54): set -o pipefail ; mount -t reiserfs -o noatime,nodiratime /dev/md4 /mnt/disk4 2>&1 |logger
Aug 27 14:11:46 IronMan kernel: mdcmd (42): check NOCORRECT
Aug 27 14:11:46 IronMan kernel: md: recovery thread woken up ...
Aug 27 14:11:46 IronMan kernel: md: recovery thread has nothing to resync
Aug 27 14:11:46 IronMan kernel: REISERFS (device md7): found reiserfs format "3.6" with standard journal
Aug 27 14:11:46 IronMan kernel: REISERFS (device md7): using ordered data mode
Aug 27 14:11:46 IronMan kernel: REISERFS (device md2): found reiserfs format "3.6" with standard journal
Aug 27 14:11:46 IronMan kernel: REISERFS (device md2): using ordered data mode
Aug 27 14:11:46 IronMan kernel: REISERFS (device md5): found reiserfs format "3.6" with standard journal
Aug 27 14:11:46 IronMan kernel: REISERFS (device md5): using ordered data mode
Aug 27 14:11:46 IronMan kernel: REISERFS (device md1): found reiserfs format "3.6" with standard journal
Aug 27 14:11:46 IronMan kernel: REISERFS (device md1): using ordered data mode
Aug 27 14:11:46 IronMan kernel: REISERFS (device md3): found reiserfs format "3.6" with standard journal
Aug 27 14:11:46 IronMan kernel: REISERFS (device md3): using ordered data mode
Aug 27 14:11:46 IronMan kernel: REISERFS (device md4): found reiserfs format "3.6" with standard journal
Aug 27 14:11:46 IronMan kernel: REISERFS (device md4): using ordered data mode
Aug 27 14:11:46 IronMan kernel: REISERFS (device md6): found reiserfs format "3.6" with standard journal
Aug 27 14:11:46 IronMan kernel: REISERFS (device md6): using ordered data mode
Aug 27 14:11:46 IronMan kernel: REISERFS (device md7): journal params: device md7, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
Aug 27 14:11:46 IronMan kernel: REISERFS (device md7): checking transaction log (md7)
Aug 27 14:11:46 IronMan kernel: REISERFS (device md5): journal params: device md5, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
Aug 27 14:11:46 IronMan kernel: REISERFS (device md5): checking transaction log (md5)
Aug 27 14:11:46 IronMan kernel: REISERFS (device md1): journal params: device md1, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
Aug 27 14:11:46 IronMan kernel: REISERFS (device md1): checking transaction log (md1)
Aug 27 14:11:46 IronMan kernel: REISERFS (device md6): journal params: device md6, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
Aug 27 14:11:46 IronMan kernel: REISERFS (device md6): checking transaction log (md6)
Aug 27 14:11:46 IronMan kernel: REISERFS (device md2): journal params: device md2, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
Aug 27 14:11:46 IronMan kernel: REISERFS (device md2): checking transaction log (md2)
Aug 27 14:11:46 IronMan kernel: REISERFS (device md4): journal params: device md4, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
Aug 27 14:11:46 IronMan kernel: REISERFS (device md4): checking transaction log (md4)
Aug 27 14:11:46 IronMan kernel: REISERFS (device md3): journal params: device md3, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
Aug 27 14:11:46 IronMan kernel: REISERFS (device md3): checking transaction log (md3)
Aug 27 14:11:46 IronMan kernel: REISERFS (device md7): Using r5 hash to sort names
Aug 27 14:11:46 IronMan kernel: REISERFS (device md6): Using r5 hash to sort names
Aug 27 14:11:46 IronMan kernel: REISERFS (device md5): Using r5 hash to sort names
Aug 27 14:11:46 IronMan emhttp: shcmd (55): chmod 770 '/mnt/disk7'
Aug 27 14:11:46 IronMan emhttp: shcmd (56): chown nobody:users '/mnt/disk7'
Aug 27 14:11:46 IronMan kernel: REISERFS (device md4): Using r5 hash to sort names
Aug 27 14:11:46 IronMan kernel: REISERFS (device md1): Using r5 hash to sort names
Aug 27 14:11:46 IronMan kernel: REISERFS (device md3): Using r5 hash to sort names
Aug 27 14:11:46 IronMan kernel: REISERFS (device md2): Using r5 hash to sort names
Aug 27 14:11:47 IronMan emhttp: shcmd (57): chmod 770 '/mnt/disk6'
Aug 27 14:11:47 IronMan emhttp: shcmd (58): chown nobody:users '/mnt/disk6'
Aug 27 14:11:47 IronMan emhttp: shcmd (59): chmod 770 '/mnt/disk5'
Aug 27 14:11:47 IronMan emhttp: shcmd (60): chown nobody:users '/mnt/disk5'
Aug 27 14:11:47 IronMan emhttp: shcmd (61): chmod 770 '/mnt/disk4'
Aug 27 14:11:47 IronMan emhttp: shcmd (62): chown nobody:users '/mnt/disk4'
Aug 27 14:11:47 IronMan emhttp: shcmd (63): chmod 770 '/mnt/disk2'
Aug 27 14:11:47 IronMan emhttp: shcmd (64): chown nobody:users '/mnt/disk2'
Aug 27 14:11:47 IronMan emhttp: shcmd (65): chmod 770 '/mnt/disk1'
Aug 27 14:11:47 IronMan emhttp: shcmd (66): chown nobody:users '/mnt/disk1'
Aug 27 14:11:47 IronMan emhttp: shcmd (67): chmod 770 '/mnt/disk3'
Aug 27 14:11:47 IronMan emhttp: shcmd (68): chown nobody:users '/mnt/disk3'
Aug 27 14:11:47 IronMan emhttp: shcmd (69): mkdir /mnt/user
Aug 27 14:11:47 IronMan emhttp: shcmd (70): /usr/local/sbin/shfs /mnt/user -disks 254  -o noatime,big_writes,allow_other,default_permissions 
Aug 27 14:11:47 IronMan emhttp: shcmd (71): /usr/local/sbin/emhttp_event disks_mounted
Aug 27 14:11:47 IronMan emhttp_event: disks_mounted
Aug 27 14:11:47 IronMan emhttp: shcmd (72): /usr/local/sbin/emhttp_event stopping_svcs
Aug 27 14:11:47 IronMan emhttp_event: stopping_svcs
Aug 27 14:11:47 IronMan emhttp: Stop SMB...
Aug 27 14:11:47 IronMan emhttp: shcmd (73): /etc/rc.d/rc.samba stop |logger
Aug 27 14:11:47 IronMan emhttp: Stop NFS...
Aug 27 14:11:47 IronMan emhttp: shcmd (74): /etc/rc.d/rc.nfsd stop |logger
Aug 27 14:11:48 IronMan emhttp: Stop AFP...
Aug 27 14:11:48 IronMan emhttp: shcmd (75): /etc/rc.d/rc.atalk stop |logger
Aug 27 14:11:48 IronMan emhttp: Stop AVAHI...
Aug 27 14:11:48 IronMan emhttp: shcmd (76): /etc/rc.d/rc.avahidaemon stop |logger
Aug 27 14:11:48 IronMan logger: Stopping Avahi mDNS/DNS-SD Daemon: stopped
Aug 27 14:11:48 IronMan emhttp: shcmd (77): /etc/rc.d/rc.avahidnsconfd stop |logger
Aug 27 14:11:48 IronMan logger: Stopping Avahi mDNS/DNS-SD DNS Server Configuration Daemon: stopped
Aug 27 14:11:48 IronMan emhttp: shcmd (78): rm /etc/samba/smb-shares.conf >/dev/null 2>&1
Aug 27 14:11:48 IronMan emhttp: Start SMB...
Aug 27 14:11:48 IronMan emhttp: shcmd (79): /etc/rc.d/rc.samba start |logger
Aug 27 14:11:48 IronMan logger: Starting Samba:  /usr/sbin/nmbd -D
Aug 27 14:11:48 IronMan logger:                  /usr/sbin/smbd -D
Aug 27 14:11:48 IronMan emhttp: shcmd (80): /usr/local/sbin/emhttp_event svcs_started
Aug 27 14:11:48 IronMan emhttp_event: svcs_started
Aug 27 14:11:58 IronMan kernel: mdcmd (43): check NOCORRECT
Aug 27 14:11:58 IronMan kernel: md: recovery thread woken up ...
Aug 27 14:11:58 IronMan kernel: md: recovery thread checking parity...
Aug 27 14:11:58 IronMan kernel: md: using 1152k window, over a total of 1953514552 blocks.
Aug 27 14:11:59 IronMan kernel: md: parity incorrect: 22664
Aug 27 14:12:00 IronMan cache_dirs: ==============================================
Aug 27 14:12:00 IronMan cache_dirs: command-args=-w -m 1 -M 10 -d 9999 -B -a -noleaf
Aug 27 14:12:00 IronMan cache_dirs: vfs_cache_pressure=10
Aug 27 14:12:00 IronMan cache_dirs: max_seconds=10, min_seconds=1
Aug 27 14:12:00 IronMan cache_dirs: max_depth=9999
Aug 27 14:12:00 IronMan cache_dirs: command=find -noleaf
Aug 27 14:12:00 IronMan cache_dirs: version=1.6.5
Aug 27 14:12:00 IronMan cache_dirs: ---------- caching directories ---------------
Aug 27 14:12:00 IronMan cache_dirs: Dragonfly
Aug 27 14:12:00 IronMan cache_dirs: Movies
Aug 27 14:12:00 IronMan cache_dirs: Shigo
Aug 27 14:12:00 IronMan cache_dirs: TV Shows
Aug 27 14:12:00 IronMan cache_dirs: ----------------------------------------------
Aug 27 14:12:00 IronMan cache_dirs: cache_dirs process ID 5450 started, To terminate it, type: cache_dirs -q

Link to comment

Aug 27 14:11:48 IronMan emhttp_event: svcs_started
Aug 27 14:11:58 IronMan kernel: mdcmd (43): check NOCORRECT
Aug 27 14:11:58 IronMan kernel: md: recovery thread woken up ...
Aug 27 14:11:58 IronMan kernel: md: recovery thread checking parity...
Aug 27 14:11:58 IronMan kernel: md: using 1152k window, over a total of 1953514552 blocks.
Aug 27 14:11:59 IronMan kernel: md: parity incorrect: 22664

 

Is this something I should be worried about?  Can I just check the "correct any Parity-Sync errors..." option and fix it?  Is there a way to get a better description of the errors that are found when checking parity?

Link to comment

Yes, there should not be any parity errors to correct.

 

What hardware are you running?

What version of unRAID are you running?

Have you run a memtest?

Has this happened before?

How long have you been running the server?

 

And can we please get a .txt file of the syslog attached to your next post!

 

 

Link to comment

I don't have the exact specs here, but it's a Supermicro with an i3, five 2TB WD Greens and two 1TB WD Blacks.  As I said in the first post, I am running beta6a and have had the server for about 7-8 months or so.  The 1 error is the first I have seen, and it seems to be the only one so far (fingers crossed).  I have not seen any noticeable problems with the server and have not run any memtests on it.

 

The syslog is attached

syslog.txt

Link to comment

There is nothing you can do except run a correcting parity check unless you have checksums on all your files.
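
For anyone who wants checksums in place before the next check, here is a minimal sketch of the idea, not an unRAID feature. The function names and the choice of SHA-256 are my own assumptions; the only thing taken from this thread is that data disks are mounted under paths like /mnt/disk1.

```python
import hashlib
import os

def file_sha256(path, chunk_size=1024 * 1024):
    """Hash one file in chunks so large media files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = file_sha256(full)
    return manifest

def changed_files(old, new):
    """Return paths present in both manifests whose digests differ."""
    return sorted(p for p in old if p in new and old[p] != new[p])
```

You would run something like build_manifest("/mnt/disk1") while the data is known to be good, save the result, and compare it against a fresh manifest after a parity error to see whether any file content actually changed.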

 

There is almost no way to know which file/files are involved, or even if the error is in a file at all.  The block numbers of the first few blocks with errors are printed in the syslog, but not all of them.
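
To pull those reported numbers out of a saved syslog, something like the following should do. It is only a sketch: the pattern is based on the single "md: parity incorrect: 22664" line shown earlier in this thread, and the function name is made up for illustration.

```python
import re

# Matches the kernel message seen in the log above, e.g.
# "Aug 27 14:11:59 IronMan kernel: md: parity incorrect: 22664"
PARITY_RE = re.compile(r"kernel: md: parity incorrect: (\d+)")

def parity_error_blocks(lines):
    """Return the numbers reported in 'parity incorrect' lines, in order."""
    blocks = []
    for line in lines:
        match = PARITY_RE.search(line)
        if match:
            blocks.append(int(match.group(1)))
    return blocks

sample = [
    "Aug 27 14:11:58 IronMan kernel: md: recovery thread checking parity...",
    "Aug 27 14:11:59 IronMan kernel: md: parity incorrect: 22664",
]
print(parity_error_blocks(sample))  # [22664]
```

With the attached file you could pass an open file handle instead of the sample list, for example parity_error_blocks(open("syslog.txt")). Keep in mind the syslog only lists the first few errors, so a short list does not prove there were no others.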

 

There is a bug in EVERY unRAID version prior to 5.0beta11 (I think) that allows an error to occur if you are writing to the array while parity is being calculated or while a new disk is being added.  It was an error in the early Linux "md" driver upon which unRAID was based.  It was very elusive since it only showed up if you were writing the exact block that was being checked/calculated at the same time.  According to a Google search, all kernels prior to 2.6.32 seem to have the bug.

Link to comment
  • 3 weeks later...

If I understand the explanation correctly, unless the kernel is fixed or unRAID "patches" the errant code, I would avoid accessing the server in any way while building/checking parity.  I had multiple parity errors when I upgraded my parity drive due to accessing the array heavily (both reads and writes) during the parity rebuild process.  Testing that process while refraining from any server access resulted in no parity errors.

 

So until this issue is officially confirmed as resolved, I will always stop access during parity checks/builds.

Link to comment

Reading is not an issue, but writing to the array might be.

 

What happens is that the parity calculation process can inadvertently ignore the "dirty" flag on a disk block being written.  This prevents the block from being written to the data disk.  It is rare and elusive, since you have to be writing the block at the same time parity is processing it.  (In my case, parity takes nearly 15 hours on my older server... to write to the exact same block at the exact same time is highly unlikely.)  That is why the error took so many years to be identified and subsequently fixed in the Linux "md" driver.
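
To show why a single lost write shows up as a parity error, here is a toy illustration. It assumes plain XOR parity across the data disks and uses made-up 4-byte blocks; it is not the md driver's code, just the arithmetic behind the check.

```python
from functools import reduce

def xor_parity(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data disks, one 4-byte block each.
disks = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_parity(disks)

# While data and parity agree, any one disk can be rebuilt from the others.
rebuilt = xor_parity([parity, disks[0], disks[2]])
assert rebuilt == disks[1]

# Simulate data and parity falling out of step on one block:
# the stored parity no longer matches what the disks actually hold.
disks[1] = b"\x10\x20\x30\x41"
mismatch = xor_parity(disks) != parity
print(mismatch)  # True
```

A non-correcting check only reports that mismatch. A correcting check rewrites parity to match the data disks, which is why it cannot tell you whether the data or the parity was the side that was wrong.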

 

An old post describing the same error (I think) in a lot more technical detail is here: http://www.spinics.net/lists/raid/msg33994.html

Link to comment
Reading is not an issue, but writing to the array might be.

 

Yes, but since I connect to unRAID via SMB and NFS (dabbling sometimes with AFP, but performance has been too slow), I can't control what the OS that I'm using will actually do while "browsing" the server.  OS X apparently creates invisible "ghost" files to hold OS X-specific metadata.  My NMT Media Player may write out a "watched" flag file after a video has played completely.

 

These I cannot control, so it is best for me to prevent any access where I cannot guarantee that a "write" will not occur, and I would assume this precaution should apply to everyone who accesses their unRAID server in similar ways...

Link to comment
  • 2 weeks later...

Hello,

 

I have a similar problem. I have just moved all my data to my unRAID disks (without the parity disk) and then started building parity. The first time, it stopped at 34%, more or less. There were messages about parity errors and some errors in the counter of one disk, the web page became very unresponsive (minutes to refresh), and all my attempts to shut down from the web page or the console did not work.

 

After shutting down by brute force and starting it again, I decided to set all disks not to sleep, then I started the parity creation again. Now it's near 45%. When I came back from the sofa, the system looked a bit unresponsive, so I forced a spin-up of all disks and then it wrote a little, but now it seems really stopped (the disk light looks off).

 

I have also reset the counters to watch the activity (but there were no errors in the counters).

 

This is my current status:

 

 

Device  Identification                                         Temp.  Size     Free       Reads  Writes  Errors
parity  WDC_WD20EARX-00PASB0_WD-WCAZA8431017 (sdf) 1953514552  33°C   2 TB     -          0      197     0
disk1   WDC_WD15EARS-00Z5B1_WD-WMAVU2191182 (sdh) 1465138552  30°C   1.5 TB   40.34 GB   204    0       0
disk2   WDC_WD15EADS-00R6B0_WD-WCAVY0249852 (sdg) 1465138552  31°C   1.5 TB   1.05 TB    199    0       0
disk3   WDC_WD20EARS-00MVWB0_WD-WCAZA5876707 (sdd) 1953514552  29°C   2 TB     175.83 GB  210    0       0
disk4   WDC_WD20EARS-00MVWB0_WD-WMAZA0938029 (sda) 1953514552  30°C   2 TB     31.39 GB   209    0       0
disk5   WDC_WD20EARS-00MVWB0_WD-WCAZA5630745 (sdb) 1953514552  28°C   2 TB     291.48 GB  209    0       0
disk6   WDC_WD20EARS-00MVWB0_WD-WMAZA0939580 (sdc) 1953514552  30°C   2 TB     305.77 GB  209    0       0
flash   USB_DISK                                               -      2.03 GB  1.94 GB    145    17      0

 

Array Status

--------------------------------------------------------------------------------

Started

Stop will take the array off-line.

Parity-Sync in progress.

Cancel will stop Parity-Sync.

WARNING: canceling Parity-Sync will leave the array unprotected!

 

Total size: 2 TB
Current position: 905.67 GB (45%)
Estimated speed: 139.35 KB/sec
Estimated finish: 130644 minutes

 

Some minutes after:

 

Total size: 2 TB
Current position: 906.29 GB (45%)
Estimated speed: 1.03 MB/sec
Estimated finish: 17632 minutes
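
As a sanity check on those numbers, the estimated finish appears to be simply the remaining size divided by the current speed. The sketch below assumes decimal units (2 TB = 2000 GB, 1 GB = 1,000,000 KB, 1.03 MB/sec = 1030 KB/sec), which is my guess at how the page rounds, so the results only land close to the reported figures rather than matching them exactly.

```python
def eta_minutes(total_gb, position_gb, speed_kb_per_sec):
    """Minutes left at the current speed, using decimal units (assumption)."""
    remaining_kb = (total_gb - position_gb) * 1_000_000
    return remaining_kb / speed_kb_per_sec / 60

first = eta_minutes(2000, 905.67, 139.35)   # first status snapshot
second = eta_minutes(2000, 906.29, 1030)    # "some minutes after", 1.03 MB/sec
print(round(first), round(second))
```

Both results come out within about 1% of the 130644 and 17632 minutes shown, so the estimate itself is consistent. The speed is the real symptom: at these rates the sync would take days to months to finish.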

 

 

My system:

 

As you can see, all of these are WD drives (plus one more, NTFS-formatted and not mounted, kept aside until I see that all the data is safe).

 

The motherboard is an ASUS E35M1-M PRO (it has all components integrated, including processor and VGA), with 4GB of Kingston HyperX RAM and a supplementary SATA controller, a StarTech PEXSATA221.

Antec 900 Case and Tooq 700 PSU

- This system has been working with most of these disks under W2k8R2 for months without any issue (well, it never let the disks sleep, that's all)

 

I will leave my system trying to finish while I sleep (it is now 2:34 in the morning, Spanish time, and I'm tired :)

 

Thanks.

Link to comment
