SuperMicro continuous issues! Controller issues...



Let me ask you this. Once the M1015 controller comes in, I should be able to pop it into the server, rebuild the drive set, and have all the data there?

 

Once you assign the drives and start the array, the data should all be there.

 

Okay, I will give that a try once it comes in. So I guess I should just send back the new SAS2LP controller? I will keep y'all posted on whether this fixes the issue.

It was mentioned before: should I go ahead and swap the PSU for the Corsair AX760 that I have, or should the Corsair AX860 that is currently in the server be fine?


Did you remove that zip tie holding the SATA cables together like garycase suggested in your build thread?

 

Yes I did, I removed all bread ties.

 

Sorry, I know this is a late reply, but bunching your SATA cables with zip ties is perfectly fine.  SATA is LVD - there's no issues with crosstalk, or even with power interference.  Anyone who tells you anything else is clueless.


My OCD prevents me from having my cabling au naturel.

 

Can't say I've ever had any problems, but don't know anything about the science of it.

 

What's LVD?

 

As for changing the PSU, I agree with Mr Hexen: only ever change one variable at a time. It's good scientific practice, and it allows you to get to the root of the problem.

Sorry, I know this is a late reply, but bunching your SATA cables with zip ties is perfectly fine.  SATA is LVD - there's no issues with crosstalk, or even with power interference.  Anyone who tells you anything else is clueless.

 

 

This is incorrect. Here is a link to a technical article entitled, "Common SATA Signal Integrity Issues": http://blog.asset-intertech.com/test_data_out/2014/06/common-sata-signal-integrity-issues.html

 

 

SATA interference is totally system dependent.

 


Well here is an update

 

Getting this

Aug 31 01:23:38 Tower kernel: md: disk6 read error, sector=1678957296

Aug 31 01:23:38 Tower kernel: md: disk6 read error, sector=1678957304

Aug 31 01:23:38 Tower kernel: md: disk6 read error, sector=1678957312

Aug 31 01:23:38 Tower kernel: md: disk6 read error, sector=1678957320

Aug 31 01:23:38 Tower kernel: md: disk6 read error, sector=1678957328

Aug 31 01:23:38 Tower kernel: md: disk6 read error, sector=1678957336

Aug 31 01:23:38 Tower kernel: md: disk6 read error, sector=1678957344

Aug 31 01:23:38 Tower kernel: md: disk6 read error, sector=1678957352

Aug 31 01:23:38 Tower kernel: md: disk6 read error, sector=1678957360

Aug 31 01:23:38 Tower kernel: md: disk6 read error, sector=1678957368

Aug 31 01:23:38 Tower kernel: md: disk6 read error, sector=1678957376

Aug 31 01:23:38 Tower kernel: md: disk6 read error, sector=1678957384

Aug 31 01:23:38 Tower kernel: md: disk6 read error, sector=1678957392

Aug 31 01:23:38 Tower kernel: md: disk6 read error, sector=1678957400

Aug 31 01:23:38 Tower kernel: md: disk6 read error, sector=1678957408

 

http://lilnetwork.com/download/nas/tower-diagnostics-20150831-0124.zip

 

SMART reports indicate one drive (WDC WD40EFRX-68WT0N0 WD-WCC4E1992613) has an enormous number of CRC errors (24965), usually indicating a very bad SATA cable, but there's a small chance it's bad or loose power to the drive.

 

One other drive (WDC WD40EFRX-68WT0N0 WD-WCC4EK8ZSH81) also has a few CRC errors (138). Keep an eye on it, and if the number continues to increase, replace its SATA cable too.  All other SMART reports look fine.
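To make "keep an eye on it" concrete, here is a minimal Python sketch that compares two readings of the CRC error counts and lists the drives whose count went up. The function name `crc_increases` and the "later" reading are made up for illustration; only the "earlier" counts come from the SMART reports in this thread. How you collect the counts (for example from the UDMA_CRC_Error_Count SMART attribute) is left out on purpose.

```python
def crc_increases(previous, current):
    """Return {serial: increase} for drives whose CRC error count went up."""
    return {
        serial: count - previous.get(serial, 0)
        for serial, count in current.items()
        if count > previous.get(serial, 0)
    }

# Counts reported in this thread's SMART reports.
earlier = {"WD-WCC4E1992613": 24965, "WD-WCC4EK8ZSH81": 138}

# HYPOTHETICAL later reading, invented purely to illustrate the comparison.
later = {"WD-WCC4E1992613": 24965, "WD-WCC4EK8ZSH81": 141}

growing = crc_increases(earlier, later)
print(growing)
```

The CRC count on a drive is cumulative and does not reset when the cable is replaced, so a count that stays flat after a cable swap is the good sign; only a count that keeps climbing points at a cable or connection that is still bad.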

 

Syslog indicates that the first drive, the one with the very bad cable, is Disk 6 (sdh)!  It's on the SAS card, using the mpt2sas module, which is *trying* to report errors with its awful error handler (in my opinion!); the error reporting is singularly unhelpful.  The issue occurs at Aug 31 00:53:10: it attempts a full reset of the card and appears to think the reset was successful, but less than a minute later it reports an UNKNOWN problem with Disk 6, then spews numerous read errors.  Since the SMART report shows no indications of bad sectors, I consider these read errors to be spurious, not real; most likely the SAS card has lost contact with the drive, and the drive is not responding to the read requests.  Probably needs a reboot.  Does not inspire confidence in the card.
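For anyone wading through a syslog like this one, a rough Python sketch of how the flood of md read errors could be boiled down to one line per disk. The regular expression is written against the exact message format quoted in this thread; the function name `summarize_read_errors` is made up here, and other unRAID versions may word the message differently.

```python
import re
from collections import defaultdict

# Matches md-driver lines as quoted in this thread, e.g.
# "Aug 31 01:23:38 Tower kernel: md: disk6 read error, sector=1678957296"
READ_ERROR = re.compile(r"md: disk(\d+) read error, sector=(\d+)")

def summarize_read_errors(lines):
    """Return {disk_number: (error_count, lowest_sector, highest_sector)}."""
    sectors = defaultdict(list)
    for line in lines:
        match = READ_ERROR.search(line)
        if match:
            sectors[int(match.group(1))].append(int(match.group(2)))
    return {disk: (len(s), min(s), max(s)) for disk, s in sectors.items()}

# Three of the lines posted earlier in the thread.
sample = [
    "Aug 31 01:23:38 Tower kernel: md: disk6 read error, sector=1678957296",
    "Aug 31 01:23:38 Tower kernel: md: disk6 read error, sector=1678957304",
    "Aug 31 01:23:38 Tower kernel: md: disk6 read error, sector=1678957408",
]

summary = summarize_read_errors(sample)
print(summary)
```

A long run of consecutive sectors all failing within the same second, as in the posted log, fits the "drive stopped responding" reading better than scattered bad sectors would.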


Well Drives 1-8 are on the card and 9-12, parity, cache are on the mobo.

I unplugged drive 6 and plugged it back in, then started the server back up, and it is now running a rebuild.

 

http://lilnetwork.com/download/nas/IMG_20150831_143418.jpg

http://lilnetwork.com/download/nas/IMG_20150831_143404.jpg


Well I had plugged in the M1050 and so far no errors. But I guess only time will tell if I will have an issue with this card.  :o

 

It is currently rebuilding the data.

Although I do have to say the estimated speed is about 40MB/sec quicker than with the SuperMicro card.

For a total of 145MB/sec

 

I was just going to say congrats on getting your server back up and error free, then I read your update. Sorry to hear you are getting errors. Are you in the US? I have a very stable old Intel board that I used with unRAID for a few years, and it worked like a charm. If you are in the US I can send it to you; just pay for shipping. Maybe having another board to test/try will help?


Yea I am down in Texas

 

Edit (9/01/2015 06:38am CST-6):

The drive rebuild finished and I don't see any errors in my logs. Although I have still lost a good amount of data.  :'(

 

I have attached a diagnostic log:

http://lilnetwork.com/download/nas/tower-diagnostics-20150901-0637.zip


I don't see any problems, except what was already mentioned about disk 6. I have one disk with 1 CRC error, and I got that one error the first day I used it. It never increased, but your disk 6 has a lot of them. If your rebuild completed with no errors, then was it because of that SAS2 card? I'm lucky using 2 of them.

 

Two SAS2 cards


No, I keep having issues with the SuperMicro SAS2LP cards. I have put in an M1050 and done a rebuild.


So apparently you have been hit by the SAS and SAS2 bug in unRAID. Some people have it; most do not. We would hear a lot more about it on here if the problem were larger.

 


Well, I was having an issue with the rebuild at first, until I unplugged the drives and plugged them back in. Although I am still not fully comfy with putting data on the server. I've been jacking with this thing for over a year, and right when I think I have all the bugs fixed, this darn thing gives me issues again. Thankfully my IMPORTANT data is backed up in more than one place. But still. I am tired of downloading my DVDs and Blu-rays to my computer, compressing/formatting them, then moving them to the server. I have some big ideas that I want to use the server for, but it just seems like I can't with the thing always giving me issues. I am starting to miss my PowerEdge 1900... granted, it didn't have nearly the storage, but I didn't have to mess with it and it just worked.

 

I am still sticking with unRAID don't get me wrong but this is annoying!

 

 

And the card is actually a LSI MegaRAID 9240-8i PCI-E 6Gb RAID Controller IBM M1015

 

Here are some of my other posts about almost the same issues.

http://lime-technology.com/forum/index.php?topic=40501

http://lime-technology.com/forum/index.php?topic=35152.0


Out of curiosity, did you ever try changing out the power supply, or is that still a future troubleshooting step?
