
Time for an upgrade - suggestions?


Gizmotoy


Thanks for the suggestions.  I'd like to give the 6.x RCs a little time to cook, as a handful of minor issues came up with RC3.  Plus, I've got a pretty involved plugin/SNAP (non-cache) drive setup that's going to take some work to convert, I think.  I'll need some time to research how to get everything back up and running with Dockers and v6 plugins before I make the leap.  Might as well use that time to continue to let v6 mature and to get some burn-in on the new hardware, I think.

 

Regarding the Hitachis, probably a good call.  Might be good to make physical copies of some stuff and throw them in the fire safe.  You can never have too many backups!

Link to comment

So I'm back up and running on the new HW, and everything *seems* to be working correctly. However, I'm seeing the following errors in the syslog (complete syslog also attached).

 

May 31 15:17:05 Hyperion kernel: ata2.00: exception Emask 0x10 SAct 0x1 SErr 0x280100 action 0x6 frozen (Errors)
May 31 15:17:05 Hyperion kernel: ata2.00: irq_stat 0x08000000, interface fatal error (Errors)
May 31 15:17:05 Hyperion kernel: ata2: SError: { UnrecovData 10B8B BadCRC } (Errors)
May 31 15:17:05 Hyperion kernel: ata2.00: failed command: READ FPDMA QUEUED (Minor Issues)
May 31 15:17:05 Hyperion kernel: ata2.00: cmd 60/08:00:8f:01:8c/00:00:09:00:00/40 tag 0 ncq 4096 in (Drive related)
May 31 15:17:05 Hyperion kernel:          res 40/00:00:8f:01:8c/00:00:09:00:00/40 Emask 0x10 (ATA bus error) (Errors)
May 31 15:17:05 Hyperion kernel: ata2.00: status: { DRDY } (Drive related)
May 31 15:17:05 Hyperion kernel: ata2: hard resetting link (Minor Issues)
May 31 15:17:05 Hyperion kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300) (Drive related)
May 31 15:17:05 Hyperion kernel: ata2.00: configured for UDMA/133 (Drive related)
May 31 15:17:05 Hyperion kernel: ata2: EH complete (Drive related)
May 31 15:17:05 Hyperion kernel: ata2: limiting SATA link speed to 3.0 Gbps (Drive related)
May 31 15:17:05 Hyperion kernel: ata2.00: exception Emask 0x10 SAct 0x1 SErr 0x280100 action 0x6 frozen (Errors)
May 31 15:17:05 Hyperion kernel: ata2.00: irq_stat 0x08000000, interface fatal error (Errors)
May 31 15:17:05 Hyperion kernel: ata2: SError: { UnrecovData 10B8B BadCRC } (Errors)
May 31 15:17:05 Hyperion kernel: ata2.00: failed command: READ FPDMA QUEUED (Minor Issues)
May 31 15:17:05 Hyperion kernel: ata2.00: cmd 60/08:00:7f:06:8c/00:00:09:00:00/40 tag 0 ncq 4096 in (Drive related)
May 31 15:17:05 Hyperion kernel:          res 40/00:00:7f:06:8c/00:00:09:00:00/40 Emask 0x10 (ATA bus error) (Errors)
May 31 15:17:05 Hyperion kernel: ata2.00: status: { DRDY } (Drive related)
May 31 15:17:05 Hyperion kernel: ata2: hard resetting link (Minor Issues)
May 31 15:17:05 Hyperion kernel: ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 320) (Drive related)
May 31 15:17:05 Hyperion kernel: ata2.00: configured for UDMA/133 (Drive related)
May 31 15:17:05 Hyperion kernel: ata2: EH complete (Drive related)

 

I found this discussion of the issue ( https://lime-technology.com/wiki/index.php/The_Analysis_of_Drive_Issues ), but I've checked all the cables and they seem fine (they had been working fine before, and I didn't change any of them).
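For anyone reading the register values in the log above, here is a rough Python sketch of how the SStatus number breaks down. It assumes the kernel prints the value in hex and that the fields follow the standard SATA layout (DET in bits 0-3, SPD in bits 4-7, IPM in bits 8-11). The function names are made up for illustration. SControl uses the same field positions, where SPD is the speed limit rather than the negotiated speed.

```python
# SPD field values from the SATA SStatus/SControl register layout.
SPEEDS = {
    0: "no speed negotiated / no limit",
    1: "1.5 Gbps",
    2: "3.0 Gbps",
    3: "6.0 Gbps",
}


def decode_sstatus(hex_text):
    """Split a hex SStatus/SControl value (e.g. "133") into its three fields."""
    value = int(hex_text, 16)
    return {
        "det": value & 0xF,         # device detection state
        "spd": (value >> 4) & 0xF,  # negotiated speed (or limit, for SControl)
        "ipm": (value >> 8) & 0xF,  # interface power management state
    }


def link_speed(hex_text):
    """Return a readable link speed for the SPD field of the given value."""
    return SPEEDS.get(decode_sstatus(hex_text)["spd"], "unknown")


if __name__ == "__main__":
    for raw in ("133", "123"):
        print(raw, decode_sstatus(raw), link_speed(raw))
```

So "SStatus 133" reads as a 6.0 Gbps link and "SStatus 123" as 3.0 Gbps, which lines up with the "limiting SATA link speed to 3.0 Gbps" message between the two resets.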

 

The error doesn't list what drive/cable/connector is the cause.  Is there any way to map "ata2.00" to an actual drive?  I see some indication it may be with a 6.0 Gbps port.  This mobo has two such ports, and I have an SSD connected to one of them.  Presumably that's the cause?  Should I just swap it to a 3.0 Gbps port?
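On mapping "ata2.00" to a drive: one approach that may work is to resolve the /sys/block/sdX symlinks and look for the ataN component in the resolved path. Below is a minimal Python sketch of that idea. It assumes a kernel that exposes the ATA port in the sysfs device path, and the function names and the sample path in the comment are hypothetical. Drives hanging off a SAS HBA such as the LSI controller will most likely not show an ataN component at all, since they don't go through libata.

```python
import glob
import os
import re

# Matches the "ataN" directory in a resolved sysfs path, for example
# /sys/devices/pci0000:00/0000:00:1f.2/ata2/host1/target1:0:0/1:0:0:0/block/sdb
# (illustrative path, not taken from this server).
ATA_PORT = re.compile(r"/ata(\d+)/")


def ata_port_from_sysfs_path(path):
    """Return the ATA port number found in a sysfs path, or None if absent."""
    match = ATA_PORT.search(path)
    return int(match.group(1)) if match else None


def map_ata_ports(sys_block="/sys/block"):
    """Build {ata_port_number: [disk names]} from the sdX entries in sysfs."""
    mapping = {}
    for entry in sorted(glob.glob(os.path.join(sys_block, "sd*"))):
        port = ata_port_from_sysfs_path(os.path.realpath(entry))
        if port is not None:
            mapping.setdefault(port, []).append(os.path.basename(entry))
    return mapping


if __name__ == "__main__":
    for port, disks in sorted(map_ata_ports().items()):
        print(f"ata{port}: {', '.join(disks)}")
```

Run on the server itself, this should print one line per ATA port, so whichever sdX shows up next to ata2 would be the drive behind the errors. The ".00" suffix is the device on that port, which is normally the only one unless a port multiplier is involved.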

 

Edit:  Update - I pulled all the cables from that backplane and reinstalled them, and specifically swapped this drive to the other 6Gbps port.  It looked like *maybe* the SATA cable was locked but sitting a bit further out than the other one.  Booted it back up and no errors this time.  I'll keep an eye on it.

 

Also, the Hitachis seem to like this motherboard's LSI controller better than whatever was on the other board.  They're all now running right around 50MBps.  Still not great, but a vast improvement over the 10-20MBps I was getting.  50 might be tolerable until they're EOL.  We'll see.

syslog-2015-05-31.txt.zip

Link to comment

Do you have your BIOS set to AHCI mode for the SATA ports, or IDE-Mode?

Not entirely sure.  Whatever the default is, which I would have assumed was AHCI.  My drives are labeled sdX and not hdX.  I can check when I reboot next.  My SSD runs over 250MB/s on its SATA3 port, which I think means it has to be AHCI, but I'll check.

 

I did flash the LSI to IT mode, the BIOS to rev3, and the IPMI firmware to 1.92, though.

Link to comment

Well, I started getting a new error in the syslog recently.  Looks like it's pretty common and maybe harmless, but it's still concerning.  I haven't noticed any ill effects.  Everything is still running perfectly.  Some posts say just to upgrade to v6 and call it a day.  Any thoughts?

 

Jun  3 03:20:01 Hyperion kernel: mce: [Hardware Error]: Machine check events logged (Errors)
Jun  3 03:42:21 Hyperion kernel: mce: [Hardware Error]: Machine check events logged (Errors)

 

The syslog rolled, so the 6/5/2015 syslog shows the first instance of the error, but the 5/31/2015 shows the rest of the log since boot.

Archive.zip

Link to comment

I've seen those on many systems ... not sure what causes them, but they do in fact seem to be harmless.    As long as everything's working okay, and you don't have any disk errors or parity sync issues, I wouldn't be concerned.

Good to hear, thanks, guys.  I've had two parity checks since the hardware upgrade and they both were perfect.  All plugins are working, and everything seems normal.  I'm planning to upgrade to V6 in the next few weeks, so I'll probably just let it hobble along until then.

Link to comment
  • 3 weeks later...
  • 2 weeks later...

Upgraded to 6.0.1 this weekend, and indeed those errors have gone away.  Thanks for the suggestion.

Also, I went back and checked those Hitachi drives.  They're all running at 100-120MB/s or so.  Slower than most of my other drives at ~130-160MB/s.  Unfortunately, I already have all the replacement drives loaded up and pre-cleared.  Now I have a bunch of drives I don't really need!  I never expected it would have been the OS causing the issue!

 

At least everything's working.

Link to comment

Great thread, I'm going to use your research.

 

It's been very hot here the last few days and I failed to check the temperature in the server closet.  My venerable ASUS P5B-VM DO (bought in 2007...) gave up the magic smoke.  Won't even POST now.

 

Are you still happy with the setup?  No regrets on RAM?  What does the Kill-A-Watt test give you?

 

I was looking at the MBD-X10SLM+-F-O, cheaper but the 8 SAS ports on the MBD-X10SL7-F-O won me over.

 

I guess that's good for the Canadian economy; once I buy this, the Canadian dollar is bound to bounce back up.  350$CAN for the CPU, dammit!

Link to comment

Archived

This topic is now archived and is closed to further replies.
