ioctl errors

abeta · September 24, 2013

I'm seeing a lot of these "errors". A quick search through the forums seems to give mixed answers as to whether they are a problem or not.

I'm running unRAID 5.0 with SAS drives, and the "errors" seem to occur when a drive is spinning down. Is this normal for SAS drives and basically a false positive, or is there something else going on?

Sep 24 01:29:31 Alpha kernel: mdcmd (81): spindown 3 (Routine)

Sep 24 01:29:31 Alpha kernel: md: disk3: ATA_OP e0 ioctl error: -5 (Errors)

Sep 24 02:27:12 Alpha emhttp: mdcmd: write: Input/output error (Errors)

Sep 24 02:27:12 Alpha kernel: mdcmd (82): spindown 6 (Routine)

Sep 24 02:27:12 Alpha kernel: md: disk6: ATA_OP e0 ioctl error: -5 (Errors)

Sep 24 03:30:42 Alpha emhttp: mdcmd: write: Input/output error (Errors)

Sep 24 03:30:42 Alpha kernel: mdcmd (83): spindown 3 (Routine)

Sep 24 03:30:42 Alpha kernel: md: disk3: ATA_OP e0 ioctl error: -5 (Errors)

Sep 24 04:51:52 Alpha emhttp: mdcmd: write: Input/output error (Errors)

Sep 24 04:51:52 Alpha kernel: mdcmd (84): spindown 3 (Routine)

Sep 24 04:51:52 Alpha kernel: md: disk3: ATA_OP e0 ioctl error: -5 (Errors)

Sep 24 04:52:02 Alpha emhttp: mdcmd: write: Input/output error (Errors)

Sep 24 04:52:02 Alpha kernel: mdcmd (85): spindown 4 (Routine)

Sep 24 04:52:02 Alpha kernel: md: disk4: ATA_OP e0 ioctl error: -5 (Errors)

Sep 24 04:52:22 Alpha emhttp: mdcmd: write: Input/output error (Errors)

Sep 24 04:52:22 Alpha kernel: mdcmd (86): spindown 6 (Routine)

Sep 24 04:52:22 Alpha kernel: md: disk6: ATA_OP e0 ioctl error: -5 (Errors)

Sep 24 10:02:14 Alpha emhttp: mdcmd: write: Input/output error (Errors)

Sep 24 10:02:14 Alpha kernel: mdcmd (87): spindown 4 (Routine)

Sep 24 10:02:14 Alpha kernel: md: disk4: ATA_OP e0 ioctl error: -5 (Errors)

Sep 24 10:19:14 Alpha emhttp: mdcmd: write: Input/output error (Errors)

Sep 24 10:19:14 Alpha kernel: mdcmd (88): spindown 1 (Routine)

Sep 24 10:19:14 Alpha kernel: md: disk1: ATA_OP e0 ioctl error: -5 (Errors)
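A quick way to see which array slots are affected is to tally the ATA_OP ioctl errors per disk. The sketch below assumes syslog lines shaped like the excerpt above; the function name count_ioctl_errors is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count "ATA_OP ... ioctl error" lines per md disk.
# Reads syslog text on stdin and prints "diskN count", sorted by disk name.
count_ioctl_errors() {
    awk '/md: disk[0-9]+: ATA_OP .* ioctl error/ {
             for (i = 1; i <= NF; i++)
                 if ($i ~ /^disk[0-9]+:$/) { sub(/:$/, "", $i); n[$i]++ }
         }
         END { for (d in n) print d, n[d] }' | sort
}
```

Usage would be something like count_ioctl_errors < /var/log/syslog. If every SAS slot shows up in the tally, that points away from a single bad cable or drive.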

dgaschk · September 24, 2013

Attach a syslog.

madburg · September 24, 2013

@abeta

Run the following command against one of your SAS drives while viewing the syslog in another session. Does the same message get generated?

hdparm -y /dev/sdX (X being the letter assigned to the SAS drive you have chosen to spin down manually via this command)

abeta · September 24, 2013

/dev/sdd:

issuing standby command

SG_IO: bad/missing sense data, sb[]: 72 05 20 00 00 00 00 1c 02 06 00 00 cf 00 00 00 03 02 00 01 80 0e 00 00 00 00 00 00 00 00 00 00

HDIO_DRIVE_CMD(standby) failed: Input/output error

This is the error on the terminal console. I don't see it in the syslog.
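The sb[] bytes printed by hdparm are SCSI sense data. As a rough sketch, assuming descriptor-format sense data (response code 72 or 73, where byte 1 is the sense key and bytes 2 and 3 are the additional sense code and its qualifier), the first four bytes can be decoded like this. The function name decode_sense is hypothetical and only a handful of values are mapped.

```shell
#!/bin/sh
# Hypothetical helper: decode the first four bytes of descriptor-format
# SCSI sense data as printed by hdparm, e.g. "72 05 20 00".
decode_sense() {
    resp=$1 key=$2 asc=$3 ascq=$4
    case "$resp" in
        72|73) ;;   # descriptor format; fixed format (70/71) is laid out differently
        *) echo "unsupported response code $resp"; return 1 ;;
    esac
    case "$key" in
        00) keyname="NO SENSE" ;;
        02) keyname="NOT READY" ;;
        05) keyname="ILLEGAL REQUEST" ;;
        *)  keyname="sense key 0x$key" ;;
    esac
    case "$asc/$ascq" in
        20/00) ascname="INVALID COMMAND OPERATION CODE" ;;
        24/00) ascname="INVALID FIELD IN CDB" ;;
        *)     ascname="ASC 0x$asc ASCQ 0x$ascq" ;;
    esac
    echo "$keyname: $ascname"
}
```

For the bytes above, decode_sense 72 05 20 00 prints "ILLEGAL REQUEST: INVALID COMMAND OPERATION CODE", which would be consistent with the ATA pass-through command being rejected as unsupported somewhere along the path to the drive.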

Syslog attached. I'm aware that I need to remove Simple Features as it isn't fully supported, but I'm in the midst of a very long copy and haven't gotten to it yet.

syslog-2013-09-24.zip

madburg · September 24, 2013

Check the hdparm version you are running:

hdparm -?

It's not SimpleFeatures.

abeta · September 24, 2013

root@Alpha:~# hdparm -V

hdparm v9.37

root@Alpha:~#

Looks like -Y is to try to put it to sleep. Lemme try now.

Console output:

/dev/sdd:

issuing sleep command

SG_IO: bad/missing sense data, sb[]: 72 05 20 00 00 00 00 1c 02 06 00 00 cf 00 00 00 03 02 00 01 80 0e 00 00 00 00 00 00 00 00 00 00

HDIO_DRIVE_CMD(sleep) failed: Input/output error

root@Alpha:~#

Nothing in the syslog to match.

madburg · September 24, 2013

Copy this package (see attachment) to your flash drive and remove the .txt extension from it. In a terminal, go to where you copied it and issue this command to install version 9.43:

upgradepkg --install-new hdparm-9.43-i486-1_pourko.txz

Re-run the command hdparm -y /dev/sdX (against the same drive you tested before).

Post the output after updating to this version of hdparm.

Don't use "-Y"; it's "-y" you want, to put the drive into standby, not sleep. Many times you cannot get the drive back up otherwise. An unRAID drive spin-down IS standby.

Separately, are you getting temperature readings for your drives on the main unRAID page?

hdparm-9.43-i486-1_pourko.txz.txt

madburg · September 24, 2013

This is true, BUT the root cause is hdparm: manually issuing hdparm commands against these drives produces an error.

His LSI controllers are supported. He is running (nice and expensive) SAS HDs, which I don't have experience with in unRAID, so it could be that the drives don't support standby. It could be cables, which I doubt: per the syslog, the driver reports an ioctl error for each and every one of those drives when unRAID issues a spin-up or spin-down, and one or two badly seated cables is plausible but ALL of them is not likely. It could be unsupported cables from these controller cards to the SAS drives. OR it could be hdparm, which made changes in regard to the old SCSI pass-through ioctl.

So it's easy and quick to rule this out by having him update hdparm (thanks to you offering this compiled version) and seeing if that changes anything, as the other causes are more difficult and could require $ to figure out.

Secondly, I'm interested in whether he is getting temps for the drives on the main page (even though SF is loaded, which may skew what the OEM main page would show, but nonetheless).

P.S. you owe me the latest smartmontools compiled

abeta · September 24, 2013

root@Alpha:/boot# ls hdparm-9.43-i486-1_pourko.txz

hdparm-9.43-i486-1_pourko.txz*

root@Alpha:/boot# upgradepkg -install-new hdparm-9.43-i486-1_pourko.txz

Cannot install -install-new: file not found

Error: there is no installed package named hdparm-9.43-i486-1_pourko.

(looking for /var/log/packages/hdparm-9.43-i486-1_pourko)

root@Alpha:/boot#

No temperature readings, but I thought that might be normal, as I tried SCSI drives a long time ago and I don't remember seeing temps.

Looks like the file is there?

ETA: If I'm reading Pourko correctly... it's expected?

abeta · September 24, 2013

Alpha login: root

Linux 3.9.6p-unRAID.

root@Alpha:~# cd /boot

root@Alpha:/boot# installpkg hdparm-9.43-i486-1_pourko.txz

Verifying package hdparm-9.43-i486-1_pourko.txz.

Installing package hdparm-9.43-i486-1_pourko.txz:

PACKAGE DESCRIPTION:

# hdparm (read/set hard drive parameters)

#

# hdparm provides a command line interface to various hard disk ioctls

# supported by the Linux ATA/IDE device driver subsystem. This may be

# required to enable higher-performing disk modes.

#

# hdparm was written by Mark Lord.

#

Executing install script for hdparm-9.43-i486-1_pourko.txz.

Package hdparm-9.43-i486-1_pourko.txz installed.

root@Alpha:/boot# hdparm -y /dev/sdd

/dev/sdd:

issuing standby command

SG_IO: bad/missing sense data, sb[]: 72 05 20 00 00 00 00 1c 02 06 00 00 cf 00 00 00 03 02 00 01 80 0e 00 00 00 00 00 00 00 00 00 00

HDIO_DRIVE_CMD(standby) failed: Input/output error

root@Alpha:/boot#

madburg · September 25, 2013

That's not good news for you. I am interested in your SAS setup; let's see if we can find what the issue(s) are.

Can you post the output of these commands:

hdparm -I /dev/sdX

hdparm -H /dev/sdX

smartctl --all /dev/sdX

These will show what data can be retrieved from the drives via your setup, as well as what it's choking on.

Can you also share what type of enclosure the drives are housed in (model), which LSI controllers you have (model), and exactly which cables you're using (model) from the LSI controllers to the drives, or to an enclosure and from the enclosure to the drives?

abeta · September 25, 2013

It's a Greenleaf Technology Cleverbox 16. I'll shoot them a note and they can comment on all of the inner workings I'm sure. The rest of the info that you requested is posted below.

ETA: http://greenleaf-technology.com/types-of-servers/silent-servers/

That might be pictures of my 16.

ABSOLUTELY PLEASED WITH GREENLEAF. Just wanted to get it down on the thread and then ask Lime Tech about it

Alpha login: root

Linux 3.9.6p-unRAID.

root@Alpha:~# hdparm -I /dev/sdd

/dev/sdd:

SG_IO: bad/missing sense data, sb[]: 72 05 20 00 00 00 00 1c 02 06 00 00 cf 00 00 00 03 02 00 01 80 0e 00 00 00 00 00 00 00 00 00 00

HDIO_DRIVE_CMD(identify) failed: Input/output error

root@Alpha:~# hdparm -H /dev/sdd

/dev/sdd:

SG_IO: bad/missing sense data, sb[]: 72 05 20 00 00 00 00 1c 02 06 00 00 cf 00 00 00 03 02 00 01 80 0e 00 00 00 00 00 00 00 00 00 00

HDIO_DRIVE_CMD(hitachisensecondition) failed: Input/output error

root@Alpha:~# smartctl --all /dev/sdd

smartctl 5.40 2010-10-16 r3189 [i486-slackware-linux-gnu] (local build)

Device: SEAGATE ST33000650SS Version: 0003

Serial number: Z292AZWL000092350KUC

Device type: disk

Transport protocol: SAS

Local Time is: Wed Sep 25 17:32:18 2013 EDT

Device supports SMART and is Enabled

Temperature Warning Enabled

SMART Health Status: OK

Current Drive Temperature: 48 C

Drive Trip Temperature: 68 C

Manufactured in week 13 of year 2012

Specified cycle count over device lifetime: 10000

Accumulated start-stop cycles: 287

Specified load-unload count over device lifetime: 300000

Accumulated load-unload cycles: 287

Elements in grown defect list: 0

Vendor (Seagate) cache information

Blocks sent to initiator = 730488606

Blocks received from initiator = 778098224

Blocks read from cache and sent to initiator = 76648815

Number of read and write commands whose size <= segment size = 3341096

Number of read and write commands whose size > segment size = 19769

Vendor (Seagate/Hitachi) factory information

number of hours powered up = 2832.42

number of minutes until next internal SMART test = 2

Error counter log:

           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   3761518176        0         0  3761518176          0       4157.876           0
write:           0        0         0           0          0        398.608           0

Non-medium error count: 168

[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']

No self-tests have been logged

Long (extended) Self Test duration: 27600 seconds [460.0 minutes]

root@Alpha:~#

madburg · September 25, 2013

That site's pictures load really slowly. Did they supply you with the SAS drives and the cables to them, or did you do that portion?

I would not think they would have supplied you with a setup that does not provide drive temps and complete integration of drive info via unRAID.

P.S. Your drives are running very hot at 48 C (assuming they are all the same model and in that chassis); you may want to look into that.

abeta · September 25, 2013

They supplied the Norco cages and tested one of the SAS drives. I put in the SAS drives when I got it there. Everything was trayless to the Norco cages as they support SAS/SATA.

I could probably use some ventilation in that corner of the office.

madburg · September 25, 2013

The LSI controllers and most cages support SAS, but I don't have experience with unRAID and SAS drives. I have seen "bad/missing sense data" once before, when a cable wasn't fully seated, but in your case it's the same for all the drives, and they're all SAS. SAS cables are different from SATA cables, so it's either the cables or some component missing (for lack of better words) in unRAID to support the SAS protocol entirely, or something to that effect.

When you say they tested one of the SAS drives, do you mean they precleared it, or ran some other tests? I'm wondering whether unRAID was actually loaded to see what it thought of the drive (in which case the missing drive temp and the syslog errors for that one drive would have been noticed).

You really should get those drives running cooler.

P.S. Did you choose/pick those particular drives?

abeta · September 25, 2013

I don't have much experience with unRAID and SAS drives either. The one time I tried SCSI drives with a 3Ware controller, I recall it was a very similar experience: no drive temps and a variety of errors similar to this. I didn't get much help from Lime Technology back then, as I don't think it was really supported, and I was just playing around with it, so I never pursued it.

Precleared and burned in.

The system is actually running ESXi with unRAID and the LSI as a pass-through.

A good friend of mine is the GM of a reseller/VAR that my company deals with. I got it at a huge discount for services rendered, and I have his guarantee of service in the event of drive failure. I hate doing RMAs for drives in general, so that was the real benefit to me. I think it cost me just a little more than SATA drives would have, with no hassles, so I didn't really care what brand/make/etc. it was as long as he didn't care.

madburg · September 25, 2013

That's all cool; nothing wrong with trying something new, as long as you don't mind that it might not be 100% in the end. It could possibly be the backplanes as well.

If you have a spare SAS drive, you could add it to a different box running some other OS (shut down your unRAID server and pull one of the LSI controllers for a moment if need be) and see if hdparm gives you back complete results. Or pull the unRAID USB key and all the drives, insert a spare SAS drive, load, say, Windows, and see what hdparm comes back with. That can help you rule out whether it's an OS/component issue or a controller/cable/backplane issue, etc. You get the idea.

I couldn't find anything on the net about hdparm issues with that particular HD model, so I don't believe it's hdparm and your particular drive.

abeta · September 25, 2013

Ack. I'm not really concerned at the moment. As long as no one here is screaming "OMG, everything is about to blow up", I think it's probably normal or as close to normal. I'm currently migrating my old unRAID machine to this one, so I've got a backup if things start acting up.

madburg · September 26, 2013

I also see why you're not getting drive temps, but I was looking for Tom's comments in a post to properly explain why in your situation (with SAS drives).

Quote from Tom

When you load the webGui page we want to go get the disk temperatures so that we can display them. For this, emhttp, uses smartctl like this:

/usr/sbin/smartctl -n standby -s on -A /dev/sdX

where "sdX" is the linux device identifier. The "-n standby" tells smartctl to exit and not access any smart attributes if it thinks the device is in standby mode, ie, spun down. This is because with most drives, reading the smart attributes causes the drive to spin up. If the temperature can't be read then webGui displays an "*" for the temperature.

In terms of which attribute to use, emhttp looks first for "Airflow_Temperature_Cel" and if found, uses that setting. If not found then it looks for "Temperature_Celsius". If neither found then it displays "*" for temperature.

This is the only use of "smartctl" in stock unraid.

But smartctl for a SAS drive reports temps as "Current Drive Temperature:"

So you could ask Tom to add checking for this third variation.
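The lookup order Tom describes, extended with the SAS variation, might look roughly like the sketch below. It only parses text on stdin, assumes the raw value sits in column 10 of the ATA attribute table (as in typical smartctl -A output), and is not emhttp's actual code; the function name drive_temp is made up.

```shell
#!/bin/sh
# Hypothetical helper: pick a temperature out of smartctl output on stdin.
# Order: Airflow_Temperature_Cel, then Temperature_Celsius, then the
# SCSI/SAS "Current Drive Temperature:" line; prints "*" if none is found.
drive_temp() {
    awk '$2 == "Airflow_Temperature_Cel" { air = $10 }
         $2 == "Temperature_Celsius"     { cel = $10 }
         /^Current Drive Temperature:/   { sas = $4 }
         END {
             if (air != "")      print air
             else if (cel != "") print cel
             else if (sas != "") print sas
             else                print "*"
         }'
}
```

Something like smartctl -A /dev/sdX | drive_temp would then cover both kinds of drive, assuming -A prints the same "Current Drive Temperature:" line for a SAS drive that --all did in the output earlier in this thread.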

Secondly, spin up/down

Quote from Tom

the unraid-driver issues spin-ups in order to implement spinup-groups, emhttp uses smartctl to spin drives down

The code to spindown a drive varies:

- for cache drive, the 'hdparm -y /dev/sdX' command is used

- for array drives, the unraid driver sends "ATA_OP_STANDBYNOW1" to the drive (which is supported by all linux disk drivers that we use).

The code to spinup a drive also varies:

- for cache drive, either I/O causes it to spin up, or an explicit 'hdparm -S0 /dev/sdX' command is used

- for array drives, either I/O causes it to spin up, or the unraid driver sends "ATA_OP_SETIDLE1" to the drive (also supported by all linux disk drivers that we use). It does this to spun-down disks of the same spin-up group as the disk with I/O occurring.

So whether or not you are using a cache drive, it's important to have hdparm compatible with your HDs, which you can see it is not. It's also pretty clear in your situation that the unRAID driver's "ATA_OP_STANDBYNOW1" and "ATA_OP_SETIDLE1" calls are failing as well (see the syslog entries). So it is up to Tom to let you know whether it's a Linux disk driver thing and what he could possibly do (or have you test), OR it's a problem with the controller/cable/backplane, which you will have to test yourself (quite easy, especially if you have a spare SAS drive).

So in the end, the drive temp could be an easy addition by Tom, and you would benefit from knowing your HDs are too HOT.

Secondly, your drives are not spinning down, so (two things): 1) you might as well turn spin-down off for now, so that attempts are not made and logged to the syslog (there is a background thread that wakes up every 5 seconds to see if it's time to spin down a drive); 2) that sucks, as keeping your drives spun down is a key point of unRAID, but not everyone cares. That's something for you to decide.

You mention "I think it's probably normal or as close to normal". I personally don't believe so. Back in the 5.0 beta days we started with no temps and no spin-down/up on anything LSI controller based, and things were added/changed to get that working. So with your setup, you are running unRAID 5.0 as it was before beta 7.

Not trying to rain on your parade; I welcome the change(s) that would need to be made to unRAID to support SAS drives.

Looking around, it does not look like hdparm supports SAS (SCSI) drives:

"It can't be used for SCSI hard disk though"

SAS controllers tunnel ATA (for lack of better words), so our SATA drives are fine.

There is an sdparm utility ("An utility similar to hdparm but for SCSI devices") that can spin SAS drives up and down though, and this is what would probably be required...
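As a very rough sketch of the detection step, a script could look for the "Transport protocol" line that smartctl printed for the SAS drive earlier in this thread and pick the command accordingly. The function name spindown_cmd is hypothetical, it only prints the command instead of running it, and whether sdparm --command=stop behaves well behind these controllers is untested here.

```shell
#!/bin/sh
# Hypothetical helper: choose a spin-down command for a device.
#   $1 = device node, $2 = text of the smartctl info output for that device
# Prints the command rather than executing it.
spindown_cmd() {
    dev=$1
    if printf '%s\n' "$2" | grep -Eq '^Transport protocol:[[:space:]]*SAS'; then
        # SCSI/SAS drive: send a SCSI stop instead of an ATA standby
        echo "sdparm --command=stop $dev"
    else
        # Anything else: the ATA standby-immediate that hdparm -y issues
        echo "hdparm -y $dev"
    fi
}
```

It could be called as spindown_cmd /dev/sdd "$(smartctl -i /dev/sdd)". This only covers the command-line side (for example a cache drive); the array drives are spun down by the unRAID driver itself, which a script like this does not change.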

prostuff1 · September 26, 2013

Sorry for getting into the conversation late.

The server is an ESXi build using one of the newer Supermicro X10 boards. It has 2 of the M1015 flashed and passed through to unRAID. There are 15 total slots available via the 3 norco cages within unRAID. The top bay is where the ESXi datastore drive resides.

A couple of the drives were provided and that is what was used to set the system up.

I did not see the spinup/spindown issue when testing because I was using mostly SATA drives to burn in the drive cages, and the flash drive I use for testing has spindown disabled.

It looks like madburg, et al. beat me to the punch, and they had more insight and suggestions than I could have provided.

I think at this point it is really up to Tom to see if there is a possibility to better support SAS drives.

abeta · September 26, 2013

I've emailed Tom this thread link and asked him if he wanted to test something, etc. I'll set the timeouts to a very large value when I get home tonight.

Spindown is a feature I really like, so hopefully it'll get fixed/resolved to my liking.

Thanks everyone. If there's other things you want me to show/display feel free to let me know. Appreciate all of the help!

prostuff1 · September 26, 2013

The only other thing I can add is that Tom might look to including sdparm. I have no idea how hard this might be or if it will even work but it is the counterpart to hdparm but for SCSI drives.

madburg · September 26, 2013

@abeta

1) Turn spin-down off rather than setting it to a longer value; otherwise you will still have the 5-second polling. Leave it off until you have some luck with Tom testing something.

2) You should flash your LSI controller BIOS (long story for this post). It is incorrect not to, and via the LSI configuration you can make adjustments so it POSTs faster and the ports are enabled ahead of time; your syslog shows the behaviour of this.

@prostuff

Waste of a drive and slot to run ESXi from a drive; I recommend running it from USB and not losing a slot.

There's no fundamental issue with the system as a whole from what I can see; there's just no full SAS support in unRAID, is all.

sdparm is nowhere near a counterpart of hdparm; it is just similar, and it would give the ability to spin drives up and down. Adding the utility itself to unRAID is simple, but getting the detection of which drives are SAS right, and executing different commands depending on whether a drive is SAS, is not. Also keep in mind that sdparm would only be used for a cache drive; the unRAID md driver would need to make SCSI calls, not ATA calls, to the SAS drives to accomplish what Tom does today to implement spin-up groups and spinning drives down and up.

This all goes back to what Pourko stated several times, that this may not be the best thing. I understand both their points. It's up to Tom whether he would entertain SAS drive support, but it will never happen overnight, so keep that in mind. The easiest part is getting the drive temps; that could be done overnight and would be a start.

prostuff1 · September 26, 2013

Quote from madburg

"@prostuff Waste of a drive and slot to run ESXi from a drive; I recommend running it from USB and not losing a slot."

ESXi is installed to a USB drive in the server. The top slot in the server is for the ESXi datastore drive, where all the VMs are housed.

Quote from madburg

"sdparm is nowhere near a counterpart of hdparm; it is just similar, and it would give the ability to spin drives up and down. Adding the utility itself to unRAID is simple, but getting the detection of which drives are SAS right, and executing different commands depending on whether a drive is SAS, is not. Also keep in mind that sdparm would only be used for a cache drive; the unRAID md driver would need to make SCSI calls, not ATA calls, to the SAS drives to accomplish what Tom does today to implement spin-up groups and spinning drives down and up."

I fully understand the programming side of things and that it would take some time to figure out properly. I don't expect it to be fixed/implemented overnight, but it would be a good thing to get on the list.

madburg · September 26, 2013

Quote from prostuff1

"ESXi is installed to a USB drive in the server. The top slot in the server is for the ESXi datastore drive, where all the VMs are housed."

My misunderstanding, sorry, got it now.

Quote from prostuff1

"I fully understand the programming side of things and that it would take some time to figure out properly. I don't expect it to be fixed/implemented overnight, but it would be a good thing to get on the list."

It would be a welcome addition.
