Vr2Io Posted May 7, 2021 1 minute ago, hansolo77 said: "6 GB/s." It is 6 Gb/s, not 6 GB/s, so two SFF-8087 connectors give 8 × 6 Gb/s of bandwidth, which simply counts as 4.8 GB/s. The 2308 is a PCIe 3.0 x8 HBA, so the max is ~8 GB/s. Please also check whether your HBA is actually running at PCIe 3.0 x8.
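The bandwidth figures in the post above can be sanity-checked with a little arithmetic. This is only a rough sketch: it assumes SAS2 lanes at 6 Gb/s with 8b/10b line encoding and PCIe 3.0 lanes at 8 GT/s with 128b/130b encoding, and it ignores protocol overhead, so real throughput will be somewhat lower. The function and variable names are made up for illustration.

```python
def lane_bytes_per_sec(line_rate_bits, payload_bits, coded_bits):
    """Usable bytes per second on one lane after line encoding."""
    return line_rate_bits * payload_bits / coded_bits / 8

# Two SFF-8087 connectors carry 4 SAS lanes each, so 8 lanes in total.
sas2_lane = lane_bytes_per_sec(6e9, 8, 10)       # 8b/10b: 600 MB/s per lane
sas2_total = 8 * sas2_lane                       # disk-side ceiling

# PCIe 3.0 x8 slot: 8 lanes at 8 GT/s with 128b/130b encoding.
pcie3_lane = lane_bytes_per_sec(8e9, 128, 130)
pcie3_x8 = 8 * pcie3_lane                        # host-side ceiling

print(f"SAS2, 8 lanes: {sas2_total / 1e9:.1f} GB/s")   # 4.8 GB/s
print(f"PCIe 3.0 x8:   {pcie3_x8 / 1e9:.2f} GB/s")     # 7.88 GB/s
```

Under these assumptions the disk side tops out below the slot, which is why the slot only becomes the bottleneck if the card negotiates a slower or narrower link than PCIe 3.0 x8.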
hansolo77 (Author) Posted May 7, 2021 It should be. I only have 3 PCIe 3.0 slots; one is used by the P2200 video card and another by the HBA. Prior to all this, I also had that 10 gig NIC installed in a slot too. Removing it should have helped the bandwidth.
Vr2Io Posted May 7, 2021 Just now, hansolo77 said: "Removing it should have helped the bandwidth." Note: in general, that doesn't help.
Vr2Io Posted May 7, 2021 (edited) 1 hour ago, hansolo77 said: "I don't think the controller has a fan on it, but I do have a fan blowing air across the entirety of my expansion cards. I also have all my fans set to be on full power rather than ramp up based on temps." I haven't installed fans on all my HBAs or 10G NICs either, but you need to confirm there is enough airflow and a safe working temperature. I had one LSI 2308 die from high temperature or insufficient airflow. It just died suddenly during operation, without causing any intermittent problems first. That was my fault, due to a cooling issue.
hansolo77 (Author) Posted May 7, 2021 Wow, it took about 20 minutes to get Memtest v9 to be fully up and ready to test. About 3 minutes of black screen, then like 2-3 minutes for each line of its pre-testing detection stuff. But I've got it up and running now. I left everything at default, which is 4 passes with the CPU using all cores in parallel. I'm going to go out to lunch. When I get back, if it's already completed I'll go back into the settings and set the pass to 99 or something so it will continue to run overnight. In a matter of like 2 minutes, it was already reporting like 48% of the first pass was completed lol. So far no errors.
Vr2Io Posted May 7, 2021 (edited) Good. 34 minutes ago, hansolo77 said: "set the pass to 99" Each pass should take 25+ minutes if I remember correctly (that was with 16 GB of RAM, but you have 64 GB). Two 4-pass runs are enough, I think; I like to speed things up. That is how I usually troubleshoot: first I look for a way to reproduce the problem as quickly as possible, and then, once the problem is solved, I run some long tests to ensure stability.
hansolo77 (Author) Posted May 7, 2021 I'm back home now.. it says it's been running for about 3 hours. It's still in pass #2. So probably won't finish 4 passes for another few hours at least. So far, no issues though.
Vr2Io Posted May 7, 2021 (edited) Good again. Next, you need to decide whether to rerun the correcting parity check (stop and restart until it is error-free) with or without ruling out some hardware first. Or at least swap PCIe slots with another add-on card; this costs nothing. If errors still occur randomly, I suggest replacing the HBA.
hansolo77 (Author) Posted May 9, 2021 After passing all the RAM tests, I've run another non-correcting parity check. It passed as well. Starting a regular check now.
Vr2Io Posted May 9, 2021 Nothing has changed, but it's normal now? Um....
hansolo77 (Author) Posted May 9, 2021 Yeah go figure. They called me in to work today so I’m not able to directly watch the check but I’ve been looking at it through My Servers off and on. So far so good. About 4.5 hours in and no errors yet (knock on wood).
hansolo77 (Author) Posted May 10, 2021 I've got about 2.5 hours left on this correcting parity check. 0 errors so far. Should I do another check after this to make sure it's still good, or would you think I'm good as it is (having done a non-correcting check then this correcting check)?
JorgeB Posted May 10, 2021 If it doesn't find any errors now, I would assume the problem is fixed, then wait until the next scheduled check.
hansolo77 (Author) Posted May 10, 2021 Just now, JorgeB said: "If it doesn't find any errors now, I would assume the problem is fixed, then wait until the next scheduled check." Cool. I typically have my scheduled checks on the first of the month. Should I maybe do a scheduled check a week from now to be safe?
JorgeB Posted May 10, 2021 It won't hurt. Make sure scheduled checks are non-correcting.
hansolo77 (Author) Posted May 10, 2021 Just now, JorgeB said: "It won't hurt. Make sure scheduled checks are non-correcting." Why would you want it to be non-correcting? I've seen this a few times.. Wouldn't fixing the errors it finds be better than just reporting that there are errors?
JorgeB Posted May 10, 2021 Unless errors are expected, it should be non-correcting, because in some rare cases, if there are disk problems, a correcting check can wrongly update parity and corrupt it.
hansolo77 (Author) Posted May 10, 2021 Ahhh... so a non-safe shutdown should be ON, but just scheduled checks should be OFF to prevent corruption. Got it! Thanks!
JonathanM Posted May 10, 2021 7 minutes ago, hansolo77 said: "Ahhh... so a non-safe shutdown should be ON, but just scheduled checks should be OFF to prevent corruption. Got it! Thanks!" Technically the safest action is NOT to write to any disks until you are sure what you are writing is correct. But in the case of a reboot without stopping the array first, parity has a very good chance of not staying in sync if there were writes occurring when the rug got yanked, because Unraid always tends to the filesystem writes first and then sends the parity writes. Parity must be correct to successfully emulate failed drives, so statistically it's better to correct after an unclean shutdown. In the absence of a known event that would knock parity out of sync, though, it's better to try to find the reason before blindly writing data. It's a matter of which makes the most sense in most cases; there can be edge cases, as always.
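The trade-off between correcting and non-correcting checks can be illustrated with a toy model. This is a hedged sketch, not Unraid's actual code: it assumes single XOR parity over two tiny data "disks", and `xor_blocks` and `parity_check` are hypothetical helpers invented for the example.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def parity_check(blocks_read, parity, correct):
    """Compare parity with the XOR of what the data disks returned.

    Returns (sync_errors, parity_after). A correcting check overwrites
    parity with the recomputed value; a non-correcting check only counts.
    """
    expected = xor_blocks(blocks_read)
    sync_errors = sum(1 for e, p in zip(expected, parity) if e != p)
    return sync_errors, (expected if correct else parity)

# Case 1: unclean shutdown. New data reached disk 0, the parity write did not.
disk0, disk1 = b"\x0f\x02", b"\x10\x20"
stale_parity = xor_blocks([b"\x01\x02", disk1])   # parity for disk 0's old contents
errors_found, fixed_parity = parity_check([disk0, disk1], stale_parity, correct=True)
rebuilt_disk1 = xor_blocks([disk0, fixed_parity])  # emulating disk 1 now works

# Case 2: parity is valid, but disk 0 returns garbage on one read.
good_parity = xor_blocks([disk0, disk1])
glitched_read = b"\xff\x02"
_, untouched_parity = parity_check([glitched_read, disk1], good_parity, correct=False)
_, bad_parity = parity_check([glitched_read, disk1], good_parity, correct=True)
wrong_disk1 = xor_blocks([disk0, bad_parity])      # a later rebuild of disk 1 is corrupt

print(errors_found, rebuilt_disk1 == disk1)
print(untouched_parity == good_parity, wrong_disk1 == disk1)
```

In case 1 the data on disk is the truth and parity is stale, so correcting is the right call. In case 2 the correcting check bakes a transient read glitch into parity, while the non-correcting check leaves good parity alone and only reports the mismatch.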
hansolo77 (Author) Posted May 10, 2021 (edited) I installed an application called "CA Auto Turbo Write Mode". Is this something I should uninstall? It was a recommended addon from SpaceInvaderOne. Not knowing much about how the file transfer stuff works I just blindly installed it. Could this have created parity corruption as well? I'm grasping at straws here trying to figure out what happened, and have literally disabled EVERYTHING at this point.
codefaux Posted May 10, 2021 It's important to understand what the options are and why. Turbo Write updates parity by reading the block from every other disk in your array, doing math, then writing to parity. This requires all of your disks to be spun up. Some people run spun up 24/7, some drive models refuse to spin down, etc. Turbo Write is faster, but only if it doesn't have to wait for a disk to spin up. The non-Turbo write updates parity by reading the old block from the data disk and from parity, doing math to change parity to what it should be, and writing it back. This works even if most of your drives are spun down, but can slow things down due to the time it takes to update parity. The Auto Turbo Write plugin switches back and forth depending on whether or not your disks are all spinning. It technically should not cause any negative effects, and technically should use the best case at all times, but nothing is perfect. It's unlikely you'll need or want to remove it specifically unless you know better. Parity corruption only happens when a disk is written to but parity is not updated, when data is corrupted in transit (bad RAM, etc.), or when a disk is failing somewhere. Typical causes are a hard shutdown, direct disk access outside the array, improper procedure when repairing filesystem damage, and so on.
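For anyone curious about the "doing math" part, here is a minimal sketch of the two write strategies described above. It assumes plain single XOR parity; the function names are hypothetical and this is not how Unraid's driver is actually written. The point is only that both routes arrive at the same parity block while touching different disks.

```python
def xor_bytes(a, b):
    """XOR two equal-length byte blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

def reconstruct_write(disks, target, new_data):
    """Turbo-style: read the same block from every OTHER data disk
    (all must be spinning) and XOR them with the new data."""
    parity = new_data
    for index, block in enumerate(disks):
        if index != target:
            parity = xor_bytes(parity, block)
    return parity

def read_modify_write(old_parity, old_data, new_data):
    """Non-turbo: read only the old data block and the old parity block,
    cancel the old data out of parity, then fold the new data in."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)

# Three data disks holding one tiny block each, plus their parity.
disks = [b"\x01", b"\x02", b"\x04"]
old_parity = b"\x07"                      # 0x01 ^ 0x02 ^ 0x04

# Overwrite the block on disk 1 with new contents.
new_data = b"\x08"
turbo_parity = reconstruct_write(disks, target=1, new_data=new_data)
rmw_parity = read_modify_write(old_parity, disks[1], new_data)

print(turbo_parity == rmw_parity)         # both methods agree
```

The turbo path does one read per other data disk, while the read-modify-write path does two reads and two writes on just the target disk and parity, which is why it can leave the remaining drives spun down.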
hansolo77 (Author) Posted August 15, 2022 Just reporting back in. The problems I had were all solved. I have been sitting happy with no errors for months. That is, until the start of this month. Started getting errors again. Rather than updating this thread, which is no longer related to shut down issues, I've created a new thread for my new troubles.