Preclear Error When Using Supermicro AOC-SASLP-MV8



those same capabilities are used to determine drive temperature and to spin it up and down.

It may work fine otherwise.

There is no issue (as far as I can tell) with using the AOC-SASLP-MV8 in unRAID. Even though preclear will not run on a drive attached to this card, unRAID formatted the drive, added it to the array, reports its temperature, and has no problem with spin down/up. The problem only shows up with preclear. I can only conclude that preclear needs or accesses SMART commands/data which unRAID does not...?

Joe helped me determine that the locking issues I was experiencing with WD EARS drives were due to an issue with the preclear script addressing sectors that did not exist, not with the drives themselves, the SASLP cards, or any other hardware.  Joe will likely be releasing preclear 1.8 soon, which fixes this issue.

 

So before you spend a bunch of money on new hardware, you might want to wait for preclear 1.8 and see if that fixes the issue.

My problem (I think spencer785's too) was preclearing ANY disk on this port. I cannot say for sure whether the failure to execute preclear was caused by the port or the drive (I am using the new Seagate DL2000). I can test for this later: I have a new Hitachi being shipped now, and I will try to run preclear on that drive from the same port and see what happens.

Link to comment

I think it stands to reason that ANY drive might lock up if you try to address non-existent sectors on it.  I just happened to be using all WD EARS in my testing.

Link to comment
  • 3 weeks later...

I have seen this exact issue today. After successfully preclearing 5 of my new drives using the MB SATA ports, I moved the drives to the Supermicro AOC-SASLP ports. I got these errors as well, but preclear seemed to have started and finished OK. I used the -n option, however, so it skipped the pre- and post-read.

Link to comment

It's happening to me as well.  Except these are precleared drives and I was just running preclear_disk.sh -t to make sure they all are precleared.  Dang I started one again, now I'll have to wait while it preclears again I guess.  :(

I switched data cables and power leads several times before I searched and found this thread.  I'm relieved it's nothing more serious.  

 

EDIT:  Looks like I stopped it before it wrote to the disk so it didn't start preclearing again.

This is with WD20EARS if that's useful to know.  Using unRAID 4.7.

Link to comment

The more I read about these cryptic errors, the more it seems like it's related to the support of this card in the kernel. If you look around on some other forums, many people have these types of issues with this particular card.

 

Does the 5.0beta use a newer Linux kernel? I'm wondering whether the mvsas driver is updated relative to v4.7.

 

Is everyone who has this issue in this thread running unRAID 4.7, or is anyone running 5.0beta?

Link to comment

I can confirm that I have the same issue with "A mandatory SMART command failed" with the following set up.

 

unRaid OS = v4.7

Motherboard = Gigabyte GA-880GMA-UD2H

Controllers = AOC-SASLP-MV8 x 2

HDDs = Hitachi 5K3000 2TB

 

See my post here or my screenshots below:

 

Error SMART Status command failed
...
Register values returned from SMART Status command are:
ST = 0x50
ERR=0x00
NS=0x08
SC=0xc0
CL=0x87
CH=0xe0
SEL=0x40
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

 

I then have to type 'Yes' a second time in order to kick off a pre-clear.

 

SmartError_thumb.jpg

 

The preclear SMART error only occurs when the HDDs are connected to the SASLPs (i.e. not when connected directly to the motherboard SATA connectors).
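
For anyone trying to read the register dump above: as far as I understand the ATA spec, the SMART RETURN STATUS command answers with a fixed signature in the CL/CH (LBA Mid/High) registers, 0x4F/0xC2 for "thresholds not exceeded" and 0xF4/0x2C for "thresholds exceeded". Anything else is treated as a failed command, which would explain the message here (CL=0x87, CH=0xe0). The following is only a rough Python sketch of that check; the function names are made up for illustration and this is not how smartctl or preclear is actually written.

```python
import re

# Signature pairs (CL, CH) for SMART RETURN STATUS, per my reading of the ATA spec.
PASSED = (0x4F, 0xC2)   # drive reports thresholds not exceeded
FAILING = (0xF4, 0x2C)  # drive reports a threshold exceeded


def parse_registers(dump: str) -> dict:
    """Pull NAME=0xNN pairs out of a pasted register dump (hypothetical helper)."""
    pairs = re.findall(r"\b([A-Z]+)\s*=\s*(0x[0-9a-fA-F]+)", dump)
    return {name: int(value, 16) for name, value in pairs}


def classify_smart_status(cl: int, ch: int) -> str:
    """Return 'passed', 'failing', or 'invalid' for a CL/CH register pair."""
    if (cl, ch) == PASSED:
        return "passed"
    if (cl, ch) == FAILING:
        return "failing"
    # Neither signature: the reply is not a valid SMART status answer.
    return "invalid"


dump = """ST = 0x50
ERR=0x00
NS=0x08
SC=0xc0
CL=0x87
CH=0xe0
SEL=0x40"""

regs = parse_registers(dump)
print(classify_smart_status(regs["CL"], regs["CH"]))
```

If that reading is right, the values in this dump say nothing about the drive's health either way; the reply simply did not come back in the expected form through this controller path.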

 

regards

 

Alex

 

Link to comment

I have had issues running preclears from the SASLP.  I don't do it anymore.  I always preclear from either a motherboard port or other addon controller (I have an old PCI controller that I use if the server is full).  The SASLP works fine once the drives are precleared.

 

Here is a link to the original post on this problem.

 

SMART error while preclearing on SASLP

 

And here is a followup where I included this "-T permissive" option.  It does fix the symptom you are seeing, but I am not convinced it is a good thing to do.  The problem with the crash was finally isolated to a drive whose SMART system got corrupted (see below).  That disk was RMA'ed.

 

Permissive option

 

Here was the drive that caused all my trouble.  That "Runtime_Bad_Block" attribute, I believe, was an indicator of the drive's failure.  Not sure what type of "block" it is talking about, as there are no media errors. I assume it was ROM or RAM on the drive itself that was failing.  I think it was a coincidence that this drive, which failed in this unusual way, was precleared on the SASLP, but I wanted to get the information out there so other users are on the lookout for similar symptoms.

 

Failed drive - unusual smart report

 

 

Link to comment
  • 1 month later...

Well, maybe some more insight into this.  I precleared two drives in the past two days.  Both were precleared on a SASLP.  The first one, a Hitachi, precleared fine, with no errors.  At the time, there were no HDDs within the protected array assigned to a motherboard port (a cache drive was on the last MB port).  All drives within the protected array were on the SASLP.  However, I then added the Hitachi drive to a motherboard port and let unRAID update parity.  I then tried preclearing a Samsung drive and got the SMART error that is being discussed here.

 

I ignored the errors and precleared anyway with the -D option.  So far it seems to be working okay (about 7 hours into it).

 

The only difference in my system between preclearing my Hitachi and my Samsung was that I added a drive to the protected array on a MB port.  So this would tell me that perhaps this error is either a) drive specific, or b) perhaps the SASLP can work without error, so long as you don't use the MB ports.  Once this preclear is done, I will try preclear on it again to test for the error, then reassign my Hitachi drive off the MB port and onto a SASLP port.  This will put all the drives within the protected array entirely back on the SASLP.  I will then run preclear on the Samsung drive to see if the error remains.

 

If the error goes away, it would point to it being some type of conflict caused by mixing motherboard and SASLP ports within the protected array.  And that problem sounds like something that could be fixed in software.

Link to comment

Well, now I think the problem must be with either a particular drive or perhaps certain models of drives.  Here is what I did.

 

After preclear successfully completed on my Samsung 204 with JP1 firmware, I tried running a preclear again.  It gave me the same error message.  I moved all my HDDs off the MB, so only the SASLP was used.  The drive still gave the same error.  I then tried preclearing the Hitachi 5k3000 and got no error.

 

I then put four drives back on the MB ports.  And put the 204 back on the same port as before.  Got the error on preclear.  I then put the Hitachi in the same bay (so same port/card).  I ran preclear on it and got no errors.  I then put the Samsung on a MB port, and again, got the same preclear error.

 

So the preclear error clearly follows the drive to any port.  The Hitachi drive, likewise, can go to the same ports and preclear without the error.

 

In my mind, this clearly means it is drive specific (or model specific if all 204's exhibit this behavior).
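
The swap test above boils down to asking which factor, drive or port, predicts the error every time. A small Python sketch of that reasoning follows. The port labels ("SASLP bay A", "MB port") are placeholders I made up, since the actual port names are not given, and the trial list only covers the last round of swaps described in the post.

```python
from collections import defaultdict

# (drive, port, error_seen) for the last round of swaps; port labels are hypothetical.
trials = [
    ("Samsung 204", "SASLP bay A", True),
    ("Hitachi 5k3000", "SASLP bay A", False),
    ("Samsung 204", "MB port", True),
]


def determining_factors(observations):
    """Return the factors whose value alone always predicts the outcome."""
    names = ("drive", "port")
    result = []
    for index, name in enumerate(names):
        outcomes = defaultdict(set)
        for row in observations:
            outcomes[row[index]].add(row[2])
        # A factor "determines" the error if each of its values saw one outcome only.
        if all(len(seen) == 1 for seen in outcomes.values()):
            result.append(name)
    return result


print(determining_factors(trials))
```

With these three observations only the drive qualifies: the same SASLP bay shows both outcomes depending on which drive is in it.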

 

There are two hardware differences I notice comparing all the drives I've recently precleared and the Samsung 204.  First, the Samsung reports it uses ATA-8-ACS revision 6.  All other drives that haven't given me the error report ATA-8-ACS revision 4.

 

The only other thing I notice is that offline status for all the drives that do not give an error is (0x84).  On the 204, that status is (0x0).

 

Other than those two items, nothing stands out.

 

I used smartctl -o on /dev/sdx to turn on automatic offline testing, which changed the status to (0x80).  There is no way to manually make it 0x84 (I don't think).  I still encountered the error.  So it doesn't look like offline testing being set to off causes the problem.
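
A note on those status bytes, based on how smartctl appears to print them: the top bit (0x80) is reported as "Auto Offline Data Collection: Enabled", and the low seven bits describe the last offline collection activity. Under that reading, 0x84 is not a separate setting but "auto offline enabled" plus "last collection was suspended by a host command", which would be why it cannot be set by hand. Here is a rough Python sketch of the decoding; the table reflects my understanding of smartctl's output, not an official reference.

```python
# Low 7 bits of the offline data collection status byte (assumed meanings,
# taken from the text smartctl prints for these values).
ACTIVITY = {
    0x00: "never started",
    0x02: "completed without error",
    0x04: "suspended by an interrupting command from host",
    0x05: "aborted by an interrupting command from host",
    0x06: "aborted by the device with a fatal error",
}


def decode_offline_status(status: int):
    """Split the status byte into (auto_offline_enabled, last_activity)."""
    auto_enabled = bool(status & 0x80)  # top bit: automatic offline testing on/off
    activity = ACTIVITY.get(status & 0x7F, "vendor specific or reserved")
    return auto_enabled, activity


for value in (0x00, 0x80, 0x84):
    print(hex(value), decode_offline_status(value))
```

So after turning auto offline on, 0x80 and 0x84 agree in the enable bit and differ only in what happened to the last collection run.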

Link to comment

I guess we can scrap ATA revision differences too.  I just noticed that Reply #5 on this thread got the error with a drive reporting ATA-8-ACS revision 4, which is the revision present on all the drives that don't cause an error for me.

 

I have a second Samsung 204 that I have not precleared.  As soon as I get a chance to flash the firmware, I'll try it to see if I get the same error.  I know I don't get it with Hitachi 5k3000's, Hitachi 7k2000's, and WD20EARS.  I have precleared all these types of drives on the SASLP in unRAID 4.7 using either preclear 1.9 or 1.11. No errors.

Link to comment
