SuperMicro continuous issues! Controller issues...


demonmaestro

I keep having issues with this darn server left and right. It seems to work, and I put data on the drives. Then next thing I know I have 6 drives "go bad". So I check them via a SMART report and they come back good. So I clear the red ball, add them back to the set, and do a parity rebuild. Next thing I know, half my darn data is missing or corrupted.

What the heck do I do to get this server acting correctly so I stop losing data? I have LOST at least 10TB worth of data.

 

http://lilnetwork.com/download/nas/tower-diagnostics-20150823-0242.zip

Link to comment

Well, I'm no expert, but if you're sure the drives are good, then the problem has got to be either faulty SATA cables or a dodgy controller. Are the drives all on the same drive controller?

If they are and it's onboard, then I guess you need a new MB.
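
One way to answer the "same controller?" question from the console is to look at where each disk hangs off the PCI bus. The following is only a rough sketch, assuming a Linux box where `/sys/block/sdX` resolves to a path containing the controller's PCI address; the helper names `controller_of` and `group_by_controller` are made up for illustration and are not part of unRAID.

```python
import os
import re
from collections import defaultdict

# PCI addresses look like 0000:02:00.0 (domain:bus:device.function).
PCI_ADDR = re.compile(r"[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]")


def controller_of(resolved_path):
    """Return the last PCI address in a resolved sysfs path, or None.

    The last address is the device closest to the disk, i.e. the
    controller itself rather than a bridge or root port above it.
    """
    matches = PCI_ADDR.findall(resolved_path)
    return matches[-1] if matches else None


def group_by_controller(sys_block="/sys/block"):
    """Map each controller's PCI address to the sdX disks behind it."""
    groups = defaultdict(list)
    for name in sorted(os.listdir(sys_block)):
        if not name.startswith("sd"):
            continue  # skip loop devices, md devices, and so on
        resolved = os.path.realpath(os.path.join(sys_block, name))
        groups[controller_of(resolved)].append(name)
    return dict(groups)
```

Calling `group_by_controller()` on the server and comparing the result with the list of drives that dropped out would show whether the failing disks all share one PCI address. If they do, that points at the card, its slot, or its cables rather than at the drives.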

Link to comment

Another thing that occurs to me is a RAM-related issue, as these are always hard to diagnose and can cause all sorts of random errors.

The thing to try for RAM is a long memtest. There also appears to be 32GB of RAM installed. It might be worth removing some of it, as stability often improves if you do not have all RAM slots populated.

Also worth checking whether there is a BIOS update for the motherboard.

Link to comment

That was the first thing I tried when I started having these issues. Besides, RAM shouldn't be giving me this many issues, since it is ECC.

Link to comment

Well, you've already tried replacing the card, and that did not fix it. So your next steps in narrowing this down are to replace the MB (potentially a bad PCIe slot) or to replace the 8087-SATA cables attached to the card.

 

Do you have the latest BIOS for the mainboard as well?

 

Yeah, the BIOS has been updated. I have flipped and flopped the cables. So I am going to try the card one more time and hope for the best. I really don't want to replace the mobo due to the cost.

Link to comment
Aug 23 14:56:49 Tower kernel: md: disk8: ATA_OP e3 ioctl error: -5
Aug 23 14:56:49 Tower kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO
[the line above repeats 39 more times, all at 14:56:49]
Aug 23 14:57:00 Tower emhttp: /usr/bin/tail -n 42 -f /var/log/syslog 2>&1
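
With this much smartctl chatter in the syslog, the one line that matters (the `md: disk8` error) is easy to miss. Here is a small sketch of a filter that counts the noise and pulls out the md errors. It assumes the error lines always look like the one in the log above. The `summarize` function and its patterns are my own illustration, not an unRAID tool.

```python
import re
from collections import Counter

# Substring of the repeated smartctl warning seen in the log above.
NOISE = "is using a deprecated SCSI ioctl"
# Matches lines such as "md: disk8: ATA_OP e3 ioctl error: -5".
MD_ERROR = re.compile(r"md: (disk\d+): .*error: (-?\d+)")


def summarize(lines):
    """Return (noise_count, Counter keyed by (disk, error_code))."""
    noise = 0
    errors = Counter()
    for line in lines:
        if NOISE in line:
            noise += 1
            continue
        match = MD_ERROR.search(line)
        if match:
            errors[(match.group(1), int(match.group(2)))] += 1
    return noise, errors


# Small in-memory sample built from the lines quoted in this post.
sample = [
    "Aug 23 14:56:49 Tower kernel: md: disk8: ATA_OP e3 ioctl error: -5",
    "Aug 23 14:56:49 Tower kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO",
    "Aug 23 14:56:49 Tower kernel: program smartctl is using a deprecated SCSI ioctl, please convert it to SG_IO",
]
noise_count, md_errors = summarize(sample)
print(noise_count, dict(md_errors))
```

To run it against the real log, pass an open file instead of the sample, for example `summarize(open("/var/log/syslog", errors="replace"))`. On Linux, error code -5 corresponds to EIO, a generic I/O error, so a per-disk count only shows which disks are affected, not why.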

Link to comment

I haven't heard anyone here question the power supply. This is the one server weakness that can be the most puzzling to diagnose.  Can you replace the power supply?

 

Yes, I can, but the PSU is not the issue at hand: it is not under any kind of load, and the server has been behind an APC battery all of its life.

 

nasrunning.png

Link to comment

Archived

This topic is now archived and is closed to further replies.
