neogemic, posted March 5, 2018

Hi,

Since updating to version 6.4.1 I have been getting a lot of UDMA CRC errors, especially on the drive bays that are connected to my LSI card. The count is over 1000 and it keeps generating errors whenever I run a parity sync.

Current setup: 8 drives connected via SATA straight to the motherboard, and the rest connected using the parts below.

Rosewill 3 x 5.25-Inch to 4 x 3.5-Inch Hot-swap SATAIII/SAS Hard Disk Drive Cage - Black (RSV-SATA-Cage-34): https://www.amazon.com/Rosewill-5-25-Inch-3-5-Inch-Hot-swap-SATAIII/dp/B00DGZ42SM/ref=sr_1_5?s=electronics&ie=UTF8&qid=1520262020&sr=1-5&keywords=3.5+drive+cage
LSI LOGIC SAS 9207-8i Storage Controller LSI00301: https://www.amazon.com/gp/product/B0085FT2JC/ref=oh_aui_detailpage_o08_s00?ie=UTF8&psc=1
Cable Matters Internal Mini-SAS to 4x SATA Forward Breakout Cable 1.6 Feet: https://www.amazon.com/gp/product/B018YHS8BS/ref=oh_aui_detailpage_o06_s01?ie=UTF8&psc=1

I even bought new SFF cables. The only other thing I can think of is that I might need a new drive bay or a storage controller.

Appreciate the help.

Attachment: avalon-diagnostics-20180305-0952.zip
Frank1940, posted March 5, 2018 (edited March 5, 2018 by Frank1940)

OK. CRC errors, as you may well be aware, are usually related to cabling problems. That is always the place to start. You should take a deep breath. (They should be harmless as far as your data is concerned. In version 6.4.X they have started to be tracked, and a lot of people have 'suddenly' found them!)

First, do a bit of data collection. Compile a table of the number of errors, the disks that they are on, and what controller those disks are connected to. Then run a parity check and see where the error counts increase. That will give you a point of focus.

I would suspect the cables right off the bat. You picked the short version of the SFF-8087 cable, which is excellent. The next thing to consider for all SATA cables is not to tie them all together in a tight bundle for the sake of appearance. This can cause 'crosstalk' between the cables, and that is the number one cause of CRC errors. (This is a bigger problem in servers than in desktops, since all the cables will have data on them at the same time during parity operations!) You should also make sure that all of the SATA cables are tight in their sockets and fully seated (another reason to avoid bundling).

Avoid the use of locking cables. If you do use a locking cable, pull gently on the cable. You must be able to feel some resistance/friction as you try to pull it off. No resistance? Replace the cable and deep-six it! See here for the reason for the no-locking-cable statement:

https://support.wdc.com/knowledgebase/answer.aspx?ID=10477

The advice is from WD and is only about their drives. I suspect that the same reason WD modified their drives is valid for other manufacturers, and they may have made similar changes...
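The data collection described above can be sketched in a few lines of Python. This is only a minimal sketch, not an unRAID tool: the disk names, controller labels, and counts are made-up placeholders, and `crc_increase_by_controller` is a hypothetical helper name. It takes two snapshots of the UDMA CRC error count per disk (one before and one after a parity check) and sums the increase per controller, so the controller or cage whose disks gained errors stands out.

```python
from collections import defaultdict

def crc_increase_by_controller(before, after, controller_of):
    """Sum the growth in CRC error counts per controller.

    before/after: dict mapping disk name -> UDMA CRC error count
    controller_of: dict mapping disk name -> controller label
    """
    totals = defaultdict(int)
    for disk, new_count in after.items():
        old_count = before.get(disk, 0)  # a disk missing from 'before' counts as 0
        totals[controller_of[disk]] += new_count - old_count
    return dict(totals)

# Placeholder values for illustration only (not taken from the posted diagnostics).
controller_of = {"disk1": "motherboard", "disk2": "motherboard",
                 "disk9": "LSI 9207-8i", "disk10": "LSI 9207-8i"}
before = {"disk1": 0, "disk2": 3, "disk9": 1000, "disk10": 250}
after = {"disk1": 0, "disk2": 3, "disk9": 1040, "disk10": 262}

increase = crc_increase_by_controller(before, after, controller_of)
for controller, delta in sorted(increase.items()):
    print(f"{controller}: +{delta} CRC errors during the parity check")
```

If only one group shows a non-zero increase, that group's cables, cage, or controller is the place to focus.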
neogemic, posted March 6, 2018

The SFF cable isn't connected directly to the hard drives but to the drive docks (PCIe SAS card -> SFF cable -> 3.5 to 5.25 drive bay enclosure SATA ports -> hard drive). I will try a direct connection and see what happens.
neogemic, posted March 10, 2018

Well, it looks like Unraid 6.4.1 does not like my SAS 9207-8i card at all. Looks like I have to stick to the SATA ports on the mobo for now.
JorgeB, posted March 10, 2018

If you keep getting CRC errors, it's not unRAID not liking the controller; it means there's a problem, either with the cables or the controller.
neogemic, posted March 11, 2018

I have had that card for about 2-3 years and never had issues with it. It's currently in another machine for testing (so far no errors). I did buy new SFF cables and did get errors. I plan to get a new SAS card (16-drive SAS expander) in the future and see what happens then.
JorgeB, posted March 11, 2018

I am (and I'm sure many others are also) using an LSI 9207 with v6.4.1 without any issues.
Frank1940, posted March 11, 2018 (edited March 11, 2018 by Frank1940)

I just noticed something. That SATA breakout cable has locking connectors on the drive end. Read the following Tech Bulletin from WD:

https://support.wdc.com/knowledgebase/answer.aspx?ID=10477

Now, you are probably saying, "I am not even plugging it into a drive" (if I read your setup correctly)! But you could still have the problem. Most of the locking connectors don't have the 'bumps' that force the connecting surfaces together. If the lock doesn't do the job, you have all of the conditions for an intermittent connection. The check is simple. Pull straight out on the cable; if you don't feel some friction, you don't have a reliable connection.

One more contributor to CRC errors is crosstalk between SATA cables. You should not tie SATA cables tightly together for neatness unless the cables are shielded. Most SATA-to-SATA cables are not shielded. I am not completely sure about the SFF-8087 type cables. If you have tied them together, have a good look at them. The one in the picture you referenced looked shielded, but it is difficult to tell from a photo.
neogemic, posted March 11, 2018

Trust me, I spent days experimenting with this. I don't have another SAS card yet, so I can't do a comparison test. As you can see in the attached picture, nothing is tied down (total mess).

My server has a capacity of 17 drives + 1 eSATA drive at this time:

10 - SATA ports on the motherboard
8 - LSI LOGIC SAS 9207-8i Storage Controller

At this time I'm using this:

https://www.amazon.com/gp/product/B0177GBY0Y/ref=oh_aui_detailpage_o00_s00?ie=UTF8&psc=1

until I get a new SAS card. I will run my 9207 card in my other system and see if I receive errors there as well.
John_M, posted March 11, 2018

This is probably completely irrelevant, but what processor are you using? It's being reported as:

Vendor ID: GenuineIntel
CPU family: 6
Model: 79
Model name: Genuine Intel(R) CPU 0000 @ 1.70GHz
Stepping: 1

which doesn't look right. Is it an engineering sample, by any chance?
JorgeB, posted March 11, 2018

It's either the HBA, the cables, or the enclosures. I have 24 disks connected to my 9207 and have already done at least one parity check on 6.4.1 without any issues.
neogemic, posted March 11, 2018 (edited March 11, 2018 by neogemic)

Yep, it's an Intel Xeon E5 2609 V4 ES that I got off eBay for like $100 when my Xeon E5-2620v3 died last year. I plan to replace it with a 10-core chip around summer.

I will do more tests once I get another card and I'll let you know the results.

Thanks for the support guys, appreciate it.
S80_UK, posted March 11, 2018 (edited March 11, 2018 by S80_UK)

I just noticed the sleeving over the cables from the HBA in your photo. That's not a huge length to be holding together, but it might not be helping in terms of crosstalk between cables. It won't be elegant, but I'd be tempted to cut that sleeving back and let the cables hang free and slightly separated. Then even if they are in close contact, it will only be for an inch or so at a time.
John_M, posted March 12, 2018

S80_UK said: "I just noticed the sleeving over the cables from the HBA in your photo. That's not a huge length to be holding together, but it might not be helping in terms of crosstalk between cables."

At least those cables are foil screened, unlike the regular SATA ones.
pwm, posted March 12, 2018

The only time software produces ECC/CRC errors is when overclocking is involved. And since there is no overclocking of the SATA transfers, there is no way unRAID can be the cause.

I'm pretty sure you have had these transfer errors for a while, but with 6.4.1 you got informed about them.
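The counter behind these warnings is SMART attribute 199 (UDMA_CRC_Error_Count) on SATA drives. Its raw value is a running total kept by the drive itself, which fits the idea that the errors can predate the 6.4.1 upgrade. Below is a minimal Python sketch for pulling that raw value out of attribute-table text such as `smartctl -A` prints. The sample line is made up for illustration, `crc_error_count` is a hypothetical helper name, and the parser assumes the usual ten-column attribute layout with the raw value in the last column.

```python
def crc_error_count(smart_text):
    """Return the raw value of SMART attribute 199, or None if it is absent.

    smart_text: attribute table text in the layout printed by `smartctl -A`
    (assumed columns: ID, name, flag, value, worst, thresh, type, updated,
    when_failed, raw_value).
    """
    for line in smart_text.splitlines():
        fields = line.split()
        # Attribute rows start with the numeric ID; 199 is the UDMA CRC counter.
        if len(fields) >= 10 and fields[0] == "199":
            return int(fields[9])
    return None

# Made-up sample row for illustration only (not from the posted diagnostics).
sample = (
    "ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE\n"
    "199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       1023\n"
)

print(crc_error_count(sample))
```

Recording this value for each disk before and after a parity check shows whether the count is still growing or is just an old total that is now being reported.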
neogemic, posted March 12, 2018

That may be the case, but going to 6.4.1 was not a smooth transition. I had many things broken, including my shares, when upgrading to 6.4.1. So forgive me if I'm not in the 100% bug-free software camp.