rharvey Posted August 27, 2010
I just finished putting all the new gear into my old Stacker tower. Went with the SuperMicro C2SEA and the SuperMicro AOC-SASLP-MV8 (x2) for port expansion. I had a fully working system with the original Intel board that Tom used right up until I pulled it all apart 2 hours ago. I can't get the Lexar USB drive to boot even though BIOS sees it just fine and it's currently set as the only boot device. When I attempt to boot I get the typical BIOS error that no boot device was found. I have moved the Lexar USB drive to various ports but no change. I'm thinking it's some BIOS setting, but I have tried lots of them and nothing seems to change anything. Does anyone have any ideas, or better yet, could someone with this board give me the right BIOS settings? I'm dead in the water and really could use some guidance, thanks...
UPDATE - I have two Lexar USB drives (purchased the spare) and tried that one too, but still no boot.
Rajahal Posted August 28, 2010
Try the different USB emulation modes in BIOS: USB-HDD, USB-FDD, etc.
rharvey Posted August 28, 2010
Quote from Rajahal: Try the different USB emulation modes in BIOS. USB-HDD, USB-FDD, etc.
I have tried every one of them (I think) and still nothing.
rharvey Posted August 28, 2010
Well, after a couple of hours of troubleshooting this morning I'm a little closer, but frankly standard troubleshooting techniques are proving frustrating and are not providing reliable results. I'm now able to get the server to boot into unRAID with all 6 motherboard SATA ports connected and only 2 of the add-on cards connected. I have done what I consider to be standard troubleshooting by adding one port on each reboot cycle to try to find the offending drive (or cable), but there is no rhyme or reason to when it fails or what causes it. Worse yet, this combination of motherboard and add-on cards seems to be very flaky and may prove troublesome if and when drives fail, as it suddenly causes the system to think the USB drive is not bootable. I'm wicked frustrated....
rharvey Posted August 28, 2010
So a little more information but still no answer..... I can get up to three ports off the add-on cards to work. Any time I plug in a fourth port (on either card, with any cable and any drive) the motherboard does not see the USB drive as bootable. That made me think that the power supply was not good enough to power all the drives, so I bridged in one of the original power supplies and had it take over power duties for six of the 12 total drives. Still, when I add the fourth drive off any of the add-on cards the box will not boot, so it's not power. Plus I went with the power supply that Tom uses in his current build, so it should be fine for my 12-drive setup. So really no major progress and much continued frustration. The best I can get running is the 6 SATA ports off the motherboard along with 3 off any one of the SAS cards; anything beyond that causes the USB drive to show as not bootable.
graywolf Posted August 28, 2010
In BIOS, do you have the SATA in AHCI or IDE mode? Not positive, but I seem to recall reading that in IDE mode with more than 4 drives the BIOS may not see all of them. Read it somewhere in the forums.
rharvey Posted August 30, 2010
Quote from graywolf: In BIOS, do you have the SATA in AHCI or IDE mode? Not positive, but seem to recall reading something about IDE mode and over 4 drives then BIOS may not see all of them? Read it somewhere in the forums.
Thanks Graywolf, I will give that a look and a try when I get home, as well as scour the AOC-SASLP-MV8 thread for more info.
rharvey Posted August 30, 2010
Did a bunch of internet searches and found what looks to be the issue. It seems that the SAS card's default setting is to have INT13h enabled so that you're able to select one of the SAS-attached drives as your boot device in BIOS. The problem arises when the number of boot devices available in BIOS exceeds the maximum: at that point the SAS card's firmware overrides the BIOS and takes the USB drive's boot slot, causing the system to think the selected boot device is not bootable. The fix is to turn off INT13h in the SAS card's BIOS, which I will try when I get home, fingers crossed.
teamhood Posted August 30, 2010
Wow... I hope that works for you. My AOC-SASLP-MV8 is now working perfectly since a new SAS-to-SATA cable came (1 SATA port was bad). I sure hope this resolves your boot problems...
rharvey Posted August 30, 2010
UPDATE - Disabling INT13h on each of the AOC-SASLP-MV8 cards did the trick; the system booted up first shot once it was turned off..... BUT there is one issue that came up with the migration to the new hardware. Two of the drives have come up weird, and yes, I did do a couple of reboots just to make sure it would not clear itself, in case it was due to a bad unmount previously. One comes up as unformatted with a green light but no temp, the other comes up as unformatted too but red, see below..... Any help would be great, as I think #8 is empty but #9 is over 1/2 full.
Joe L. Posted August 31, 2010
Quote from rharvey: UPDATE - Setting the INT13h on each of the SASL-MV8 cards did the trick, booted up first shot once they were turned off..... BUT there is one issue that came up with the migration to the new hardware.
That is good news.
Quote from rharvey: Two of the drives have come up weird and yes I did do a couple of re-boots just to make sure it would not clear itself due to bad unmounting previously. One comes up as unformatted with green light but no temp, the other comes up as unformatted too but red, see below..... any help would be great as I think #8 is empty but #9 is over 1/2 full.
The only way to know what is happening is for you to attach a copy of your syslog.
Joe L.
rharvey Posted August 31, 2010
Joe, thanks for helping.... It's been a long time since I have needed to pull a syslog and frankly I have forgotten how to. I did a quick search but came up with nothing. It would be good if these "how to" posts were sticky, but there really is no active forum master to do that kind of work. Anyway, if you could provide a simple refresher it would be great.
Joe L. Posted August 31, 2010
Quote from rharvey: Joe, Thanks for helping .... It's been a long time since I have needed to pull a syslog and frankly I have forgotten how to. I did a quick search but came up with nothing. It would have been good if these "how to" type of posts were sticky posts but there really is no active forum master to do that kind of work. Anyway if you could provide simply refresh instructions it would be great.
It is in the wiki under "troubleshooting":
http://lime-technology.com/wiki/index.php?title=Troubleshooting#Capturing_your_syslog
rharvey Posted September 1, 2010
Here is my syslog....
Attachment: syslog_9_1_2010_RPH.zip
Joe L. Posted September 1, 2010
The entire log is filled with these messages:
    Aug 31 04:40:01 Tower syslogd 1.4.1: restart.
    Aug 31 04:40:05 Tower kernel: ata2: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
    Aug 31 04:40:05 Tower kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
    Aug 31 04:40:05 Tower kernel: ata2: error=0x04 { DriveStatusError }
    Aug 31 04:40:05 Tower kernel: ata2: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
    Aug 31 04:40:05 Tower kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
    Aug 31 04:40:05 Tower kernel: ata2: error=0x04 { DriveStatusError }
    Aug 31 04:40:05 Tower kernel: ata2: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
    Aug 31 04:40:05 Tower kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
    Aug 31 04:40:05 Tower kernel: ata2: error=0x04 { DriveStatusError }
    Aug 31 04:40:05 Tower kernel: ata2: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
    Aug 31 04:40:05 Tower kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
    Aug 31 04:40:05 Tower kernel: ata2: error=0x04 { DriveStatusError }
    Aug 31 04:40:05 Tower kernel: ata2: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
    Aug 31 04:40:05 Tower kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
    Aug 31 04:40:05 Tower kernel: ata2: error=0x04 { DriveStatusError }
    Aug 31 04:40:05 Tower kernel: ata2: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
    Aug 31 04:40:05 Tower kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
    Aug 31 04:40:05 Tower kernel: ata2: error=0x04 { DriveStatusError }
    Aug 31 04:40:05 Tower kernel: ata2: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
    Aug 31 04:40:05 Tower kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
    Aug 31 04:40:05 Tower kernel: ata2: error=0x04 { DriveStatusError }
    Aug 31 04:40:05 Tower kernel: ata2: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
    Aug 31 04:40:05 Tower kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
    Aug 31 04:40:05 Tower kernel: ata2: error=0x04 { DriveStatusError }
We need the prior log file (or reboot and grab a fresh one before it fills and is switched out to syslog.1). The prior log file is likely in /var/log/syslog.1
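When a log is flooded like the one above, collapsing identical messages makes it easier to see what is actually repeating and what else is buried in between. The following is a minimal Python sketch, not part of unRAID or its wiki instructions. The function name `summarize` and the regular expression are assumptions based only on the line format shown in the post, and since a stock unRAID install may not include Python, it is meant to be run on a copy of the log on another machine.

```python
import re
from collections import Counter

# Assumed line layout, taken from the excerpt in the post:
#   "Aug 31 04:40:05 Tower kernel: ata2: status=0x51 { ... }"
#   month day time    host  source: message
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+[\d:]{8})\s+(\S+)\s+([^:]+):\s*(.*)$")


def summarize(lines):
    """Count identical (source, message) pairs, ignoring timestamps.

    Returns a list of ((source, message), count), most frequent first.
    Lines that do not look like syslog entries are skipped.
    """
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        _timestamp, _host, source, message = match.groups()
        counts[(source, message)] += 1
    return counts.most_common()


# A few lines copied from the post, used only as demo input.
SAMPLE = [
    "Aug 31 04:40:01 Tower syslogd 1.4.1: restart.",
    "Aug 31 04:40:05 Tower kernel: ata2: status=0x51 { DriveReady SeekComplete Error }",
    "Aug 31 04:40:05 Tower kernel: ata2: error=0x04 { DriveStatusError }",
    "Aug 31 04:40:05 Tower kernel: ata2: status=0x51 { DriveReady SeekComplete Error }",
    "Aug 31 04:40:05 Tower kernel: ata2: error=0x04 { DriveStatusError }",
]

if __name__ == "__main__":
    for (source, message), count in summarize(SAMPLE):
        print(f"{count:6d}  {source}: {message}")
```

To use it on a real file, pass an open file object instead of `SAMPLE`, for example a copy of /var/log/syslog.1 pulled off the server.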
rharvey Posted September 1, 2010
Sorry about that Joe - I rebooted last night and did not pull the syslog until this morning. This one will work.......
Attachment: syslog.1_RPH.zip
Joe L. Posted September 1, 2010
First... be VERY careful with the 4.5.3 release of unRAID. It has a MAJOR bug where ALL the drives will come up as unformatted when you first boot the server. If you press the format button when that happens it will happily do as you instructed and format ALL your drives. The correct action instead is to simply "Stop" the array, and then press "Start" to restart it.
That is not your immediate issue, but it was the reason for a HUGE thread and an emergency release to fix it. It caught a few people who were adding one new drive and expected to format only that drive, but ended up formatting ALL their drives. Please upgrade once things are stable. (For your own sanity.)
These are the errors that are causing the disk to not show up:
    Aug 31 02:19:46 Tower kernel: ata2: translated ATA stat/err 0x41/04 to SCSI SK/ASC/ASCQ 0xb/00/00
    Aug 31 02:19:46 Tower kernel: ata2.00: device reported invalid CHS sector 0
    Aug 31 02:19:46 Tower kernel: ata2: status=0x41 { DriveReady Error }
    Aug 31 02:19:46 Tower kernel: ata2: error=0x04 { DriveStatusError }
    Aug 31 02:19:46 Tower kernel: sd 0:0:1:0: [sdc] Result: hostbyte=0x00 driverbyte=0x08
    Aug 31 02:19:46 Tower kernel: sd 0:0:1:0: [sdc] Sense Key : 0xb [current] [descriptor]
    Aug 31 02:19:46 Tower kernel: Descriptor sense data with sense descriptors (in hex):
    Aug 31 02:19:46 Tower kernel: 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
    Aug 31 02:19:46 Tower kernel: 00 00 00 00
    Aug 31 02:19:46 Tower kernel: sd 0:0:1:0: [sdc] ASC=0x0 ASCQ=0x0
    Aug 31 02:19:46 Tower kernel: sd 0:0:1:0: [sdc] CDB: cdb[0]=0x28: 28 00 00 00 00 00 00 00 20 00
    Aug 31 02:19:46 Tower kernel: end_request: I/O error, dev sdc, sector 0
Disk8 also seems to be affecting disk9:
    Aug 31 02:19:47 Tower kernel: sd 0:0:1:0: [sdc] Result: hostbyte=0x00 driverbyte=0x08
    Aug 31 02:19:47 Tower kernel: sd 0:0:1:0: [sdc] Sense Key : 0xb [current] [descriptor]
    Aug 31 02:19:47 Tower kernel: Descriptor sense data with sense descriptors (in hex):
    Aug 31 02:19:47 Tower kernel: 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
    Aug 31 02:19:47 Tower kernel: 00 00 00 4e
    Aug 31 02:19:47 Tower kernel: sd 0:0:1:0: [sdc] ASC=0x0 ASCQ=0x0
    Aug 31 02:19:47 Tower kernel: REISERFS (device md1): found reiserfs format "3.6" with standard journal
    Aug 31 02:19:47 Tower kernel:
    Aug 31 02:19:47 Tower kernel: sd 0:0:1:0: [sdc] CDB: cdb[0]=0x28: 28 00 00 00 00 4f 00 00 08 00
    Aug 31 02:19:47 Tower kernel: end_request: I/O error, dev sdc, sector 79
    Aug 31 02:19:47 Tower kernel: REISERFS (device md1): using ordered data mode
    Aug 31 02:19:47 Tower kernel: md: disk8 read error
    Aug 31 02:19:47 Tower kernel: handle_stripe read error: 16/8, count: 1
    Aug 31 02:19:47 Tower kernel: REISERFS warning (device md9): sh-2006 read_super_block: bread failed (dev md9, block 2, size 4096)
    Aug 31 02:19:47 Tower kernel: REISERFS warning (device md8): sh-2006 read_super_block: bread failed (dev md8, block 2, size 4096)
The file systems on disk8 and disk9 are not being found, probably because the "read" is failing. You might check whether the cable to those two drives is good.
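The hex values in the excerpt above can be unpacked by hand, which helps confirm that the failing commands are plain reads at the very start of the disk. Below is a minimal Python sketch under stated assumptions: the function names `decode_status` and `decode_read10` are made up for illustration, the bit labels DriveReady, SeekComplete and Error are the ones the kernel prints in this log, and the remaining labels (BSY, DF, DRQ, CORR, IDX) are the usual ATA status-bit abbreviations rather than anything taken from the thread. Opcode 0x28 is the SCSI READ(10) command, whose bytes 2-5 hold the starting block address and bytes 7-8 the number of blocks.

```python
# ATA status register bits, most significant first.
STATUS_BITS = [
    (0x80, "BSY"),
    (0x40, "DriveReady"),
    (0x20, "DF"),
    (0x10, "SeekComplete"),
    (0x08, "DRQ"),
    (0x04, "CORR"),
    (0x02, "IDX"),
    (0x01, "Error"),
]


def decode_status(value):
    """Return the names of the bits set in an ATA status byte."""
    return [name for mask, name in STATUS_BITS if value & mask]


def decode_read10(cdb_hex):
    """Decode a READ(10) CDB given as space-separated hex bytes.

    Returns (starting_block, block_count).
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    starting_block = int.from_bytes(cdb[2:6], "big")
    block_count = int.from_bytes(cdb[7:9], "big")
    return starting_block, block_count


if __name__ == "__main__":
    # Values copied from the log lines quoted in the post.
    print(decode_status(0x51))
    print(decode_status(0x41))
    print(decode_read10("28 00 00 00 00 00 00 00 20 00"))
    print(decode_read10("28 00 00 00 00 4f 00 00 08 00"))
```

The second CDB decodes to a starting block of 0x4f, which is the decimal 79 reported in the matching "I/O error, dev sdc, sector 79" line, so the two log entries describe the same failed read.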
rharvey Posted September 1, 2010
Joe, I stopped and then restarted the array and there was no change in these two drives. I then replaced the breakout cable going to these two drives with a new one and still no change. Would you suggest I upgrade to the newest version now...?
Joe L. Posted September 1, 2010
Quote from rharvey: Joe, I stopped and then re-started the array and there was no change in these two drives. I then replaced the breakout cable going to these two drives with a new one and still no change. Would you suggest I upgrade to the newest version now...?
Probably would not hurt. Although I think the kernel is still the same version, there may be a few newer drivers included.
rharvey Posted September 1, 2010
Upgraded to 4.5.6 and no change at all..... What would you suggest? Note that both of these drives were green and normal prior to the hardware upgrade. I even moved the breakout cable supporting these two drives to another SAS port and that did nothing either.
Joe L. Posted September 1, 2010
Quote from rharvey: Upgraded to 4.5.6 and no change at all..... what would you suggest. Note that both of these drives were green and normal prior to the hardware upgrade. I even moved the breakout cable supporting these two drives to another SAS port and that did nothing either.
Are they powered by a common cable?
rharvey Posted September 1, 2010
Yes they are. The power supply I chose is the same one that Tom uses in his current hardware build. It has two 12V Molex-style cable feeds; however, they are both connected to the same rail inside the supply.
bcbgboy13 Posted September 1, 2010
1. Try with a single controller first - this will give you room for up to 14 drives.
2. Then try to swap the problematic Seagate drive (on the controller) with another one from the motherboard SATA ports, to see if the error will follow the drive - if they have an older firmware they may have problems with the Supermicro controller.
rharvey Posted September 2, 2010
Quote from bcbgboy13: 1. Try with a single controller first - this will give you room for up to 14 drives. 2. Then try to swap the problematic Seagate drive (on the controller) with another one from the motherboard SATA ports - if they have an older firmware they may have problems with the Supermicro controller - to see if the error will follow the drive.
I have done the single SAS card thing with no change. You're right, it's time to start moving stuff around to see if the problem follows the drives or stays with the port. Thanks...
rharvey Posted September 7, 2010
Update - For those who may care, I moved both of the drives that were causing issues around in the array, and without question the issues followed the drives each time, so it was clear that unRAID did not like them for some reason. One of the two drives was totally empty, so there was no issue with pulling that one out. The second was a 1.5TB drive with 15 Blu-ray movies on it. I installed it on a SATA channel on my Windows machine and, using YAReG, I was able to see the drive and pull all the movies off without a single error. Makes me wonder why unRAID had such an issue with the drive. Installed two replacement drives, reset the unRAID configuration to new, ran a parity sync, and all is good.
Side note: my parity checks are now lightning fast, woo hoo!
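The swap test that settled this can be written down as a small rule: a drive is suspect if it fails on every port it was tried on, and a port is suspect if every drive tried on it fails, provided each was tried in at least two combinations. The Python sketch below only illustrates that reasoning; the function name `localize_fault` and the drive and port labels in the demo are hypothetical and not taken from the actual server.

```python
from collections import defaultdict


def localize_fault(observations):
    """Decide whether errors follow the drive or stay with the port.

    observations: iterable of (drive, port, had_errors) tuples, one per boot
    or test. Returns (suspect_drives, suspect_ports) as sorted lists.
    """
    ports_by_drive = defaultdict(set)
    drives_by_port = defaultdict(set)
    errors_by_drive = defaultdict(list)
    errors_by_port = defaultdict(list)

    for drive, port, had_errors in observations:
        ports_by_drive[drive].add(port)
        drives_by_port[port].add(drive)
        errors_by_drive[drive].append(had_errors)
        errors_by_port[port].append(had_errors)

    # Require at least two different pairings before blaming anything,
    # otherwise a single bad drive would also implicate its only port.
    suspect_drives = sorted(
        d for d, errs in errors_by_drive.items()
        if len(ports_by_drive[d]) >= 2 and all(errs)
    )
    suspect_ports = sorted(
        p for p, errs in errors_by_port.items()
        if len(drives_by_port[p]) >= 2 and all(errs)
    )
    return suspect_drives, suspect_ports


if __name__ == "__main__":
    # Hypothetical labels; errors follow "disk8" from a SAS port to a motherboard port.
    tests = [
        ("disk8", "sas-port-1", True),
        ("disk2", "sas-port-1", False),
        ("disk8", "mb-sata-3", True),
        ("disk2", "mb-sata-3", False),
    ]
    print(localize_fault(tests))
```

With results like the ones described in the post, where the errors moved with the two drives each time, only the drives end up in the suspect list and the ports and cables are cleared.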