McGeeked Posted August 8, 2012

Server specs:
5x 3TB Western Digital Green (WD30EZRX)
3x Supermicro AOC-SAS2LP-MV8
Intel i3-2120 (3.3GHz)
Supermicro MBD-X9SCM-F-O
8GB Kingston (DDR3 1333)
Corsair AX850
OS: unRAID Server version 5.0-rc5 AiO

Good afternoon. I am trying my first unRAID build and am extremely new to the product. I just installed all of the above hardware and everything booted up fine. I was able to get through the initial stages of unRAID setup using the older configuration guide. I also installed unMENU (which is working great) and have sSMTP enabled and working. I also have the preclear_disk.sh script on the root of the flash drive.

Having done my reading ahead of time, I would normally have posted my syslog, but here is my problem. After I start the preclear process on one of my drives, it runs perfectly for a short time, and then all of a sudden my syslog gets slammed with errors. The log fills up extremely fast and subsequently crashes my server, at which point the preclear telnet updates freeze (at this point I am not even sure if it is still running). I also get locked out of the web GUI, etc. To give you an idea, my syslog hit over 2GB before I tried to save it.

When I first started, I ran 4 preclears at the same time using screen, but when I woke up I found that this issue was occurring, so I needed to restart. I toned it down at first, thinking that maybe 4 was overkill and that was causing the crash, but it appears to be the syslog errors. I am in the process of trying another preclear on a single drive to see how far I get, but if I want to run 3 cycles on each drive I'd rather not have to do it one at a time, so I am trying to see if anyone else has seen this or might know what is going on. I am going to try to capture the syslog before it gets too big this time and will see if I can get it uploaded.

I know that there were issues with the SAS2LP-MV8s prior to version 5, but I was informed that 5 fixed these problems. I attached my syslog from before the preclear process just in case one of you can identify something there. Appreciate any help. Thanks.

Attachment: syslog-2012-08-08.zip
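For anyone facing the same runaway log, one way to capture something postable before the server locks up is to copy only the last few hundred lines to the flash drive instead of saving the whole multi-gigabyte file. The following is a minimal sketch, not part of unRAID or unMENU: the function name, the output file name, and the default line count are assumptions made for illustration, and it presumes the syslog lives at /var/log/syslog and the flash drive is mounted at /boot.

```shell
#!/bin/bash
# Copy only the last N lines of a (possibly huge) log into a small file.
# The default paths and line count below are assumptions for illustration.
save_log_tail() {
    local src="${1:-/var/log/syslog}"          # log file to sample
    local dest="${2:-/boot/syslog-tail.txt}"   # hypothetical output file on the flash drive
    local lines="${3:-500}"                    # number of trailing lines to keep
    tail -n "$lines" "$src" > "$dest"
}

# Example call (run from a telnet session while the box still responds):
#   save_log_tail /var/log/syslog /boot/syslog-tail.txt 500
```

Because tail only reads the end of the file, this should stay quick even when the log has grown very large, and the resulting text file is small enough to attach to a post.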
Thornwood Posted August 8, 2012

Try the newest test2 version. I think you need this with your config. Look for the experimental test build for SAS.
McGeeked Posted August 8, 2012

Thanks, giving that a shot now.
McGeeked Posted August 8, 2012

Alright, I upgraded to unRAID Server 5.0-rc6-r8168-test2 AiO. The same issue is still occurring. Since I can't even save the syslog, I took a partial copy of it before everything crashed; see the attached Word doc. Basically, my syslog is filling with the messages shown in that doc. This time, once I started the preclear, the errors started immediately (or so it seemed). I waited about 5 minutes, and when I attempted to save my syslog it was already about 1GB. I am pretty lost at this point, so any help would be appreciated. Thanks.

Attachment: syslogcopypart2.doc
mr-hexen Posted August 8, 2012

It seems only one drive is causing this. Try reseating the SATA cable and power cable, or replace them with spares if available. Then try again.
McGeeked Posted August 8, 2012

Alright, so I took the extra two controllers out, as I was not really using them anyway. I replaced and re-seated the cables on the first controller. Ports 0-3 on the controller go to the first backplane port on my Norco 4224, and ports 4-7 go to the second port on the backplane. I am not sure whether this actually matters.

To help my sanity: if unRAID detects the drives (all 5 of them), does that mean the backplane in my case is good, or not? Basically I don't want to swap backplanes out if I don't have to. Some reviews of this case report problems with them, but I am not sure how to determine whether mine is affected. Also, if the drives are detected, won't that mean the cabling is good as well? I assume that is not the case based on your suggestion, but I have a total of 6 cables to use, and I highly doubt they are all bad.

I was getting some strange results when I attempted to preclear again with just the one controller. Basically the same thing occurred, except that this time I did not even get a chance to check the syslog; unMENU crashed immediately.
Joe L. Posted August 8, 2012

Sorry, but your .doc is unreadable here. Please convert it to a .txt, or post a small sample of the error messages for analysis in your next post. It is highly likely you have a bad drive, but I won't know until I can see the syslog.

Joe L.
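When a log is flooded with repeats, a small sample is often easier to produce by counting distinct messages than by copying pages of identical lines. The sketch below is only an illustration under stated assumptions: the function name is made up, and the sed pattern assumes the traditional syslog prefix of month, day, time, and hostname at the start of each line. If the log uses a different prefix, the pattern would need adjusting.

```shell
#!/bin/bash
# Print the most frequent messages in a syslog-style file, ignoring timestamps.
# Assumes each line starts with "Mon DD HH:MM:SS hostname " (traditional syslog format).
top_messages() {
    local src="${1:-/var/log/syslog}"   # log file to summarize
    local count="${2:-20}"              # how many distinct messages to show
    # Strip the timestamp and hostname so identical messages compare equal,
    # then count duplicates and list the most frequent first.
    sed -E 's/^[A-Za-z]{3} +[0-9]+ [0-9:]+ [^ ]+ //' "$src" \
        | sort | uniq -c | sort -rn | head -n "$count"
}

# Example call:
#   top_messages /var/log/syslog 20 > /boot/syslog-summary.txt
```

Each output line is a repeat count followed by the message text, so a device or port that dominates the flood should appear at the top of the list.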
McGeeked Posted August 8, 2012

No problem. I appreciate all of the help. See the attached txt; it's just one page of the errors.

Attachment: syslog.txt
McGeeked Posted August 8, 2012

Also, during boot my syslog is showing some error messages; see the attached txt. I am going to completely remove the drive that I was initially trying to preclear and see how I do from there.

Attachment: bootwithoutsdb.txt
McGeeked Posted August 8, 2012

Here is the syslog that I was able to get before everything crashed after starting a new preclear on a completely different hard drive (the previous one was removed). I don't believe the preclear is even starting before it crashes.

Attachment: preclearafterhdremoved.txt
Joe L. Posted August 9, 2012

Quoting McGeeked: "Also during boot my syslog is showing some error messages, see attached txt. I am attempting to remove the drive completely that I was trying initially to preclear and see how I do from there."

Those are just lines in the syslog that contain the string of letters "error". They are not errors. They describe which error handler will be used if an error does occur.
McGeeked Posted August 9, 2012

Understood. So do you think it's as simple as my having received 5 bad hard drives? Should I maybe drop down to the test1 build?
Joe L. Posted August 9, 2012

Quoting McGeeked: "Understood, so do you think its as simple as I received 5 bad hard drives? Maybe drop down to the test1 build?"

That is unlikely, although if all were "drop" shipped in the same package, it could happen. It is far more likely that you have a power supply issue, a disk controller issue, a cabling issue, or a motherboard issue. I would start with a memtest, just to eliminate memory as the cause.
McGeeked Posted August 9, 2012

I ran the memtest and it passed just fine. I also did some digging through the forums and found that others with these controllers needed to disable INT13, which I did. I also disabled PCIe OPROM in the main BIOS. Although I have three of the controllers, I only have one installed at the moment to see if I can just get past this initial problem. The results are the same: once I start the preclear, the messages that I posted appear, my syslog gets flooded, and the server crashes. Any other thoughts?
bcbgboy13 Posted August 9, 2012

Try running a preclear on a disk attached to the mainboard SATA ports. That way you will eliminate possible bad backplanes, bad cables, and problematic SAS2LP card/unRAID version combinations. Make sure the preclear script you use is a recent one (supporting 3TB+ HDs). If the disks can be precleared this way, then the problem is elsewhere.

I am not that familiar with the Norco cases, but I believe there were different "revisions" with different backplanes, and I do remember some posts where people experienced problems if they had powered a backplane with only a single Molex connector.

One more thing to keep in mind: people often confuse SASLP cards with your SAS2LP, and both with the LSI SAS cards (very different). The various betas have had problems with some of them in the past (still not fixed), so you will have to find a revision that does not have problems with the SAS2LP.
McGeeked Posted August 9, 2012

Appreciate the reply. I am getting much further connected directly with the SATA cables. For now I am preclearing 4 at once and getting nothing in my syslog. I just started it, so we will see how far I get, but this is already a lot further than crashing almost immediately.

I am fairly sure that others have these controllers (SAS2LP) working with the 5.0 test build. I have not read, nor has anyone confirmed, that unRAID works with these controllers on the latest test2 build, so I will continue to try on the test build. Ideally, I would like to use these controllers if possible. If my information is correct, if I buy SATA controllers and use SAS-to-SATA cables, my parity sync speeds won't be as fast as with a direct SAS-to-SAS connection.

I just don't know if I should start replacing parts, as I can't pinpoint exactly what the problem is. The only thing I can think of at this point is that unRAID is not behaving properly with these controllers, although some people have them working on this build. For all 6 SAS cables, that many backplanes, and all 3 controllers to be bad just seems unlikely.
tyrindor Posted August 9, 2012

I have the exact same build, and all I had to do was disable PCI-E OPROM and INT13 on each of my cards. Make sure you disabled OPROM for all slots and disabled INT13 for all 3 cards (not just 1). I find RC6-test1, not test2, is still the best for SASLP/SAS2LP.

I assume you are using 1.13 of the preclear script? You don't need to run any extra commands: /boot/preclear_disk.sh /dev/XXX (XXX = the drive you want to preclear).

If all that is done, and since you said it works fine when you bypass the cards and backplane, I would assume something is wrong with one of the backplanes, the cards, or possibly even a PCI slot. Perhaps you changed something odd in the BIOS? Try updating your motherboard BIOS. That board recently got a new release, 2.0a, and I believe it ships with just 2.0. I would definitely reset to stock settings after the update and disable only OPROM. I would not mess with boot order or anything like that, other than putting the unRAID USB first. Don't disable anything else on it.
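Since the thread mentions running several preclears at once under screen, here is a hedged sketch of how the per-drive command above could be wrapped so each drive gets its own named, detached session. The function name and the device names (sdb, sdc) are hypothetical placeholders, and the function only prints the commands rather than running them, so they can be reviewed before anything touches a disk. It assumes the script is at /boot/preclear_disk.sh, as stated in the thread.

```shell
#!/bin/bash
# Print one "screen" command per drive; nothing is executed here.
# Device names passed in (e.g. sdb, sdc) are placeholders, so check them carefully,
# because preclearing wipes the target drive.
preclear_screen_commands() {
    local dev
    for dev in "$@"; do
        # screen -dmS NAME CMD starts CMD in a detached session called NAME
        echo "screen -dmS preclear_${dev} /boot/preclear_disk.sh /dev/${dev}"
    done
}

# Example: review the generated commands first
#   preclear_screen_commands sdb sdc
# Later, reattach to one session to watch progress or answer prompts:
#   screen -r preclear_sdb
```

Printing instead of executing is a deliberate choice here: a mistyped device name is far cheaper to catch on screen than after the script has started. If the script asks for confirmation before clearing, each session would need to be attached once to answer it.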
McGeeked Posted August 9, 2012

As far as I know I am using the correct version of preclear, as the last comment in the file shows v1.13. I attached it just to be sure, though. I did disable INT13 and OPROM while testing just 1 of the SAS controllers. unRAID could see all of my HDs during that time; it was only when I started to preclear that things went downhill. Can I assume that the backplane is functional if unRAID is identifying the drives, or is there more to it than that?

After this preclear over SATA completes in about 2 hours, I am going to use the process of elimination and start moving the one controller to each PCIe slot and testing with preclear. However, I don't know how granular I need to be. For example, should I test with the controller in the first x8 slot connected to the first backplane; if that fails, replace the cable; if that fails, move to a different backplane and test; and if that fails, finally move the controller to the second PCIe slot on the first backplane and continue from there? I am not yet familiar enough with this hardware to know where to set my troubleshooting limits, so if anyone can shed some light, you could probably save me some troubleshooting time. Again, I am very grateful for the assistance and the knowledge.

Attachment: preclear_disk.txt
McGeeked Posted August 14, 2012

It turned out to be a BIOS issue. I needed to upgrade the motherboard BIOS from 2.0 to 2.0a.
Archived
This topic is now archived and is closed to further replies.