July 22, 2015
Hi guys, I've got a very peculiar situation. I will describe it step by step:
1. I built a new unRAID server v6.0.1. Precleared all disks - OK.
2. Transferred the data from the old one to this one. All OK.
3. Parity check - lots of errors, disk1 redballs.
4. As I'm sure this disk is fine, I did a new config and assigned the disks to the array. It started a parity sync that finished OK.
5. I did an rsync dry run to check the integrity of the files on disk1 - all OK, no errors!
6. Parity check - again, lots of errors, disk1 redballs.
Notice that errors *only* happen during parity check. Parity sync, data transfer, rsync all finish OK. Relevant portion of the syslog attached. syslog_partial.zip
July 22, 2015 Author
Here it is. Quickly glancing over it, looks like parity check causes the whole controller to lock up? (attached diags removed for privacy)
July 22, 2015 Author
PSU: 2 x Supermicro 1200W
MB: Dell 0YXT71
SATA: Supermicro AOC-SAS2LP-MV8 with the latest FW
Backplane: Supermicro SAS2-846EL1
RAM: 8Gb (tested OK)
PS: I read about the SAS2LP lockup bug, but that was resolved in Kernel 3.6.2-1.fc16 (?)
July 23, 2015
None of the disks are responding. There is a hardware fault in the SATA system. Check all cables and connections.
July 23, 2015 Author
All works perfectly until I start the parity check. Even parity sync, heavy data transfer, etc. - all OK. Parity check kills it.
July 24, 2015 Author
OK.
Checked all cables - OK
Re-added the "failed" disk - OK
Array initiated recovery, finished OK
Did rsync --dry-run to check the files against the originals - OK
Parity check - same, errors galore ending in red-balling.
What gives?
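A note on the `rsync --dry-run` check above: by default rsync decides whether a file differs from its size and modification time, and only compares content when `--checksum` is given. If the goal is to confirm that the copied data matches the originals byte for byte, a content comparison is the stronger test. Below is a minimal Python sketch of such a comparison. The function names `file_digest` and `compare_trees` are made up for illustration, and the two directory arguments stand in for wherever the originals and the copy are mounted.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(source_root, copy_root):
    """Return (missing, different): relative paths of source files that
    are absent from copy_root, or present with different content."""
    source_root, copy_root = Path(source_root), Path(copy_root)
    missing, different = [], []
    for src in sorted(p for p in source_root.rglob("*") if p.is_file()):
        rel = src.relative_to(source_root)
        dst = copy_root / rel
        if not dst.is_file():
            missing.append(str(rel))
        elif file_digest(src) != file_digest(dst):
            different.append(str(rel))
    return missing, different
```

Both lists coming back empty means every file under the source tree exists in the copy with identical content. Files that exist only in the copy are not reported by this sketch.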
July 25, 2015
"Have you run a memtest?"
"Yep. Reply #4 above - 'RAM: 8Gb (tested OK)'"
Sorry, I missed that. How long did you let memtest run? I would recommend overnight at a minimum; multiple passes are suggested.
August 6, 2015 Author
OK guys, what is so special about parity check? What does it do that no other unRaid operation ever does? Just to spell out what works and what doesn't:
1. rsync to transfer data - OK
2. array recovery - OK
3. parity check - lots of errors, disk redballs, controller locks up sometimes.
I even thought the old bug mentioned here had raised its ugly head again ("everything works well, untill i do parity check. then i get a lot of errors." - sounds familiar?) But I reflashed the firmware like other guys did, and it didn't fix it.
TL;DR: I got tired of waiting for a solution, so I just bought an M1015, reflashed it to IT mode, and all is peachy. My SAS2LP is collecting dust now...
August 21, 2015 Community Expert
"Anyone?"
Your previous post said "...just bought an M1015, reflashed it to IT mode, and all is peachy..." so what do you want from anyone?
August 21, 2015 Author
"what is so special about parity check? What does it do that no other unRaid operation ever does?"
So pretty much, why does the SAS2LP work for every operation except parity check?
August 21, 2015 Community Expert
"what is so special about parity check? What does it do that no other unRaid operation ever does?"
Parity check is the only time that you are simultaneously reading from all drives at the maximum obtainable speed.
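To try that access pattern outside of a parity check, one option is to read from every drive in parallel and watch the syslog for controller errors. The following is only a rough Python sketch of the idea, not what unRAID's parity check actually runs. `read_fully` and `read_all_at_once` are hypothetical names, and the paths passed in (for example raw device nodes, which need root to read) are assumptions that depend on the system.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def read_fully(path, chunk_size=1 << 20):
    """Read a file or device from start to end; return the byte count."""
    total = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total


def read_all_at_once(paths):
    """Read every path in parallel, one thread per path.

    Returns ({path: bytes_read}, elapsed_seconds).
    """
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=max(1, len(paths))) as pool:
        sizes = list(pool.map(read_fully, paths))
    elapsed = time.monotonic() - start
    return dict(zip(paths, sizes)), elapsed
```

Because each thread blocks on I/O rather than on computation, the reads overlap and the controller sees requests for all drives at once. If the controller locks up under this load as well, that points at the hardware or driver path rather than at the parity logic.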
August 21, 2015 Community Expert
"'what is so special about parity check? What does it do that no other unRaid operation ever does?' So pretty much, why does the SAS2LP work for every operation except parity check?"
Reading a file only involves the drive being read. Writing a file to cache only involves the cache drive. Writing a file on the parity-protected array only involves the drive being written and the parity drive. Checking parity, or rebuilding parity, or rebuilding a data disk, all require all drives to be accessed simultaneously.
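The difference between those operations can be illustrated with a toy model. The Python sketch below assumes simple single-disk XOR parity and represents each disk as one short block of bytes. All function names are invented for illustration, and it says nothing about how the unRAID driver schedules its I/O.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def check_parity(data_disks, parity):
    """Parity check: needs a block from every data disk plus parity."""
    return xor_blocks(data_disks) == parity


def write_block(data_disks, parity, index, new_block):
    """Array write: touches only the target disk and the parity disk.

    New parity = old parity XOR old data XOR new data.
    """
    old_block = data_disks[index]
    new_parity = xor_blocks([parity, old_block, new_block])
    data_disks[index] = new_block
    return new_parity


def rebuild_disk(data_disks, parity, missing_index):
    """Rebuild: XOR of parity and every surviving data disk."""
    survivors = [d for i, d in enumerate(data_disks) if i != missing_index]
    return xor_blocks(survivors + [parity])
```

In this model `write_block` never looks at the other data disks, while `check_parity` and `rebuild_disk` cannot produce a result without reading all of them. The model does not by itself explain why one controller would fail on only one of the all-drive operations.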
August 21, 2015 Author
"Reading a file only involves the drive being read. Writing a file to cache only involves the cache drive. Writing a file on the parity-protected array only involves the drive being written and the parity drive. Checking parity, or rebuilding parity, or rebuilding a data disk, all require all drives to be accessed simultaneously."
Did you read my post? Rebuilding a data disk worked OK several times. Checking parity always killed it.
August 21, 2015 Community Expert
"'Reading a file only involves the drive being read. Writing a file to cache only involves the cache drive. Writing a file on the parity-protected array only involves the drive being written and the parity drive. Checking parity, or rebuilding parity, or rebuilding a data disk, all require all drives to be accessed simultaneously.' Did you read my post? Rebuilding a data disk worked OK several times. Checking parity always killed it."
I didn't re-read the entire (old) thread if that's what you mean. Seemed like you said your problems were solved.
August 21, 2015
Check out this thread. I don't know if his fix will apply to others, but it seems worth trying, if you're in a position to purchase a new set of memory. I confess I also didn't read enough to know if his situation is the same as yours.
August 21, 2015 Author
"Check out this thread. I don't know if his fix will apply to others, but it seems worth trying, if you're in a position to purchase a new set of memory. I confess I also didn't read enough to know if his situation is the same as yours."
All OK after I changed the controller. So there is something with SAS2LP and parity check only. That's what's really strange...
August 21, 2015
Considering how many folks use that controller with zero issues, it's pretty safe to assume either your controller is faulty, overheating, or something else is broken. You cannot make the sweeping statement that there's an issue with all SAS2LP.