SG_IO: bad/missing sense data, sb[] ...



Hi,

 

I have just upgraded from 5.0-beta12a to RC8, which seemed to be just fine.

The array is started and working fine, but on the console I keep getting the following error:

SG_IO: bad/missing sense data, sb[] ...

 

I attached a screenshot of the console and the system log.

 

Running unRAID as VM under ESXi 5.1, with RDM for data drives.

 

- Itamar.

unRAID-5.0rc8-post-upgrade-system-log.txt

unRAID-Upgrade-RC8-SG_IO-Error.png


I just went to RC8a and my console keeps getting spammed with the same error.  On a hunch, I removed my RDM'ed SSD cache drive and it went away.  I guess it's trying to read sensor data (temperature, etc.) from the SSD that isn't available through RDM.  It's annoying, yes, but it doesn't appear to log the constant spam anywhere, so I guess it's just annoying :)


I'm not sure it's *just* annoying.

Maybe this is an unrelated issue, but while I was running RC8 I experienced several "strange directory locks" on random user-share directories.

I have no idea if it's related to this issue though.

 

The phenomenon looked sort of like this:

1. I write a file to some directory in a user share.

2. After the file is written, the parent directory is "locked", meaning I can't access it over SMB ("Access Denied" or something similar).

3. Locally on unRAID the directory seems fine: I can list it, and all files are there with valid mode and ownership.

4. I think restarting unRAID helped for the session (though I'm not sure whether I had to revert to beta12a for the issue to go away).

5. All this happens randomly; sometimes files are written and nothing bad happens.

 

Nothing relevant appeared in the log (the same log I attached to the first post).


Thanks for the heads up - will keep an eye on things and let you know if I experience anything similar.

 

I don't think it's related, though; the info out there suggests the error is produced by an inability to read certain sensors on a drive.  RDM'ing doesn't allow things like temperature sensors to talk to unRAID.  In your case, I'd say it's because all your data drives were RDM'ed.

 

This is supported by the fact that my 8 data/parity drives (on M1015's passed through to the unRAID VM) don't display this issue once the cache drive is removed from the array.

 

I think the core of the problem is that we are RDM'ing drives.  I might just move my cache drive to one of the backplanes in my Norco so it's passed through, and get rid of the error that way.
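
For reference, a rough way to see which device can't return sense/SMART data is to probe each disk with smartctl. This is only a sketch, assuming smartmontools is installed on the unRAID box; the device glob and output text are illustrative, and the USB flash drive will also show up as having no SMART data:

```python
#!/usr/bin/env python3
# Rough sketch: probe every /dev/sd? device with smartctl and flag the ones
# that return no usable SMART data. Assumes smartmontools is installed.
import glob
import subprocess

for dev in sorted(glob.glob("/dev/sd?")):
    try:
        result = subprocess.run(["smartctl", "-i", dev],
                                capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired) as err:
        print("{}: could not run smartctl ({})".format(dev, err))
        continue
    if "SMART support is: Enabled" in result.stdout:
        print("{}: SMART data available".format(dev))
    else:
        # RDM'd disks (and the unRAID flash drive) tend to land here: basic
        # I/O works, but SMART/sense queries come back empty.
        print("{}: no usable SMART data - possible source of the SG_IO noise".format(dev))
```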


Has anyone figured out the cause of this problem? I'm also running an SSD passed through via RDM (ESXi) to Unraid. I'm curious if this is an error I should be concerned about, or just a nuisance? It makes it nearly impossible to use the Unraid console in ESXi.

 

I'm also on Unraid rc8a and have attached the messages I'm getting.

Screen_Shot_2012-12-29_at_4_23.41_PM.png


Has anyone figured out the cause of this problem? I'm also running an SSD passed through via RDM (ESXi) to Unraid. I'm curious if this is an error I should be concerned about, or just a nuisance? It makes it nearly impossible to use the Unraid console in ESXi.

 

I'm also on Unraid rc8a and have attached the messages I'm getting.

The only solution I know of is to pass the controller through to the unRAID VM rather than individual drives.  The error is just a nuisance, so you could ignore it.  However, if you have any spinners that are RDM'd as well, they will not get spun down.
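
If you want to verify the spin-down point, `hdparm -C` reports a drive's power state. A minimal sketch (device paths are placeholders; it assumes hdparm is available on the unRAID console, and an RDM'd spinner that never reports "standby" is never being spun down):

```python
#!/usr/bin/env python3
# Minimal sketch: report the power state of a few drives via `hdparm -C`.
# The device paths are placeholders - substitute your own parity/data drives.
import subprocess

DRIVES = ["/dev/sdb", "/dev/sdc"]  # hypothetical device names

for dev in DRIVES:
    out = subprocess.run(["hdparm", "-C", dev],
                         capture_output=True, text=True).stdout
    # hdparm -C prints a line like " drive state is:  standby" or
    # " drive state is:  active/idle"
    states = [line.strip() for line in out.splitlines() if "drive state" in line]
    print("{}: {}".format(dev, states[0] if states else "no state reported"))
```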

BobPhoenix, thanks for the info. I have my parity drive (a 3TB spinner) RDM'd as well. I did this so that I could keep the parity drive off the PCI-E bus (as described in the tuning tips). Any thoughts on doing that?

 

I've passed all three M1015's through to unRAID, and all drives on those cards work perfectly. Should I bite the bullet on the parity calculation and just use the drive on one of the M1015's without RDM?

 

As for the spin-down issue, that means my parity drive will never spin down. I'd think parity would be the most active drive; does it ever spin down normally?


BobPhoenix, thanks for the info. I have my parity drive (a 3TB spinner) RDM'd as well. I did this so that I could keep the parity drive off the PCI-E bus (as described in the tuning tips). Any thoughts on doing that?

You want it off a PCI bus, not a PCIe bus.  PCIe is currently the second-fastest access you can get; only motherboard chipset controllers are faster, and only if they are of current vintage (SATA II or SATA III).  Edit: actually, that holds if you are using x4 or x8 PCIe slots.  If you had more than 2 drives on a PCIe x1 slot, it might be slower than an RDM'd drive on a motherboard port or an x8 slot.  Edit2: another exception is if you are using a striped RAID setup to simulate your parity drive; that would probably work better as well, but I believe you still need to pass the RAID controller through to get that faster access.
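
To put rough numbers on that, here is a back-of-envelope comparison. The figures are approximate theoretical maximums (PCIe 2.0, per direction) and an assumed ~150 MB/s sequential rate per spinner, not measurements:

```python
# Back-of-envelope bus comparison for the point above. All figures are
# approximate theoretical maximums, not benchmarks.
PCIE2_PER_LANE = 500                      # ~MB/s per PCIe 2.0 lane
SATA = {"SATA II": 300, "SATA III": 600}  # ~MB/s per port
DRIVE_SEQ = 150                           # assumed sequential MB/s per spinner

for lanes in (1, 4, 8):
    slot = PCIE2_PER_LANE * lanes
    print("PCIe 2.0 x{}: ~{} MB/s -> roughly {} spinners at {} MB/s each "
          "before the slot becomes the bottleneck".format(
              lanes, slot, slot // DRIVE_SEQ, DRIVE_SEQ))

for name, mbps in sorted(SATA.items()):
    print("{} port: ~{} MB/s per drive".format(name, mbps))
```

So a single drive is fine almost anywhere, but an x1 slot runs out of headroom with only a few spinners, which is where an RDM'd drive on a motherboard port could come out ahead.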

 

I've passed all three M1015's through to unRAID, and all drives on those cards work perfectly. Should I bite the bullet on the parity calculation and just use the drive on one of the M1015's without RDM?
Yes, you will get faster access this way than with an RDM'd drive.  The only exception would be if you were using a SAS expander like I am; then it might be faster to use the RDM'd drive.

 

As for the spin-down issue, that means my parity drive will never spin down. I'd think parity would be the most active drive; does it ever spin down normally?

Parity is the least-used drive on my unRAID systems; since I only write data once but read it many times, it stays spun down 95%+ of the time.

BobPhoenix, thank you again for the info, this is extremely helpful. This is also making me rethink my current setup.

 

So if you had the 3x M1015 setup, it sounds like you would keep everything unRAID-related on the passed-through M1015's. I'm running ESXi and can't pass the mobo controller through to unRAID, since my datastores live on SSDs on the two SATA III ports, which leaves RDM as the only way to use the mobo SATA II ports in unRAID.

 

Is making the change to the configuration as simple as stopping the parity calc, stopping the array, rearranging the drives, fixing the assignments, and starting again? Will I have to re-preclear the parity drive? I've done 3 preclear passes on all the 3TB spinners in the system at this point, and the parity calc is at 30% for the current array configuration without any data, so assume I have no data to lose at this point.


BobPhoenix, thank you again for the info, this is extremely helpful. This is also making me rethink my current setup.

 

So if you had the 3x M1015 setup, it sounds like you would keep everything unRAID-related on the passed-through M1015's. I'm running ESXi and can't pass the mobo controller through to unRAID, since my datastores live on SSDs on the two SATA III ports, which leaves RDM as the only way to use the mobo SATA II ports in unRAID.

 

Is making the change to the configuration as simple as stopping the parity calc, stopping the array, rearranging the drives, fixing the assignments, and starting again? Will I have to re-preclear the parity drive? I've done 3 preclear passes on all the 3TB spinners in the system at this point, and the parity calc is at 30% for the current array configuration without any data, so assume I have no data to lose at this point.

Yes, I would put all array drives on the 3 M1015s.  It should be that simple, but you are getting into an area I haven't gone into myself.  I've only RDM'd WHS drives before, and yes, that is how it worked when I moved them to a SASLP-MV8.  You don't need to pre-clear a parity drive at all, except to weed out a bad drive (although I pre-clear every drive I buy, including those headed for Windows VMs).  Parity drives don't have a file system on them and don't need to be pre-cleared.  The most you would have to do when moving ONLY the parity drive is a recalculation/rebuild of parity against the rest of the array, and you probably won't even need that.  I would still run a non-correcting parity check after the migration even if a rebuild is not needed.

Don't have an answer for you - hopefully someone else will.  I only have one ESXi server left and it will be bare metal unRAID in a month or two.  Maybe less.

 

Is there even a way to determine which drive it's trying to read the data from (I'm guessing with hdparm)?

Unfortunately, I do not have that information.  A Linux guru might be needed.

Don't have an answer for you - hopefully someone else will.  I only have one ESXi server left and it will be bare metal unRAID in a month or two.  Maybe less.

 

Is there even a way to determine which drive it's trying to read the data from (I'm guessing with hdparm)?

 

What is your SCSI controller 0 set to? I accidentally set mine to something other than VMware Paravirtual and was getting the same error described in this thread. Setting it back to VMware Paravirtual fixed it.
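
For anyone wanting to double-check theirs, the controller type is stored in the VM's .vmx file as scsi0.virtualDev, where "pvscsi" corresponds to VMware Paravirtual (other values include "lsilogic" and "lsisas1068"). A quick sketch; the datastore path is a placeholder:

```python
#!/usr/bin/env python3
# Sketch: report the virtual SCSI controller type from a VM's .vmx file.
# The path below is a placeholder - point it at your own datastore.
VMX_PATH = "/vmfs/volumes/datastore1/unRAID/unRAID.vmx"  # hypothetical path

with open(VMX_PATH) as fh:
    for line in fh:
        if line.strip().startswith("scsi0.virtualDev"):
            value = line.split("=", 1)[1].strip().strip('"')
            print("scsi0.virtualDev = {}".format(value))
            if value != "pvscsi":
                print("Controller is not VMware Paravirtual (pvscsi).")
            break
    else:
        print("scsi0.virtualDev not set; the VM is using the default controller type.")
```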


I've noticed I have this same spam on my unRAID setup. I also found that after a required reboot following the update to 6.3.3, my cache drive was not assigned... I had to assign it manually (it didn't stick), and come to think of it, I've had to reassign it twice now after a restart.
