6.9.x, LSI Controllers & Ironwolf Disks Disabling - Summary & Fix

Cessquill · March 11, 2021

NOTE: There's a TL;DR section at the end of this post with required steps

People with specific Seagate Ironwolf disks on LSI controllers have been having issues with Unraid 6.9.0 and 6.9.1. Typically, when spinning up, the drive could drop off the system. Getting it back on would require checking, unassigning, reassigning and rebuilding its contents (about 24 hours). It happened to me three times in a week across two of my four affected drives.

The drive in question is the 8TB Ironwolf ST8000VN004, although the 10TB has been mentioned, so it may affect several models.

There have been various comments and suggestions over the threads, and it appears that there is a workaround solution. The workaround is reversible, so if an official fix comes along you can revert your settings back. This thread is here to consolidate the great advice given by @TDD, @SimonF, @JorgeB and others to hopefully make it easier for people to follow.

This thread is also here to hopefully provide a central place for those with the same hardware combo to track developments.

NOTE: Carry out these steps at your own risk. Whilst I will list each step I did and it's all possible within Unraid, it's your data. Read through, and only carry anything out if you feel comfortable. I'm far from an expert - I'm just consolidating valuable information that was scattered across threads - if this is doing more harm than good, or is repeated elsewhere, then close this off.

The solution involves making changes to the settings of the Ironwolf disk. This is done by running some Seagate command line utilities (SeaChest) explained by @TDD here

The changes we will be making are

Disable EPC
Disable Low Current Spinup (not confirmed if this is required)

The Seagate utilities refer to disks slightly differently than Unraid, but there is a way to translate one to the other, explained by @SimonF here

I have carried out these steps and it looks to have solved the issue for me. I've therefore listed them below in case it helps anybody. It is nowhere near as long-winded as it looks - I've just listed literally every step.

Note that I am not really a Linux person, so getting the Seagate utilities onto Unraid might look like a right kludge. If there's a better way, let me know. All work is carried out on a Windows machine. I use Notepad to help me prepare commands beforehand: I can construct each command first, then copy and paste it into the terminal.

If you have the option, make these changes before upgrading Unraid...

Part 1: Identify the disk(s) you need to work on

EDIT: See the end of this part for an alternate method of identifying the disks
1. Go down your drives list on the Unraid main tab. Note down the part in brackets next to any relevant disk (eg, sdg, sdaa, sdac, sdad)
2. Open up a Terminal window from the header bar in Unraid
3. Type the following command and press enter. This will give you a list of all drives with their sg and sd reference

sg_map

4. Note down the sg reference of each drive you identified in step 1 (eg, sdg=sg6, sdaa=sg26, etc.)

image.png.70d27cd49926e662e6bb5b7e77c64336.png
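
If you have a lot of drives, the lookup in steps 3 and 4 can be sketched as a small shell helper. This is only a sketch: sg_for_sd is a made-up name, and it assumes each line of the sg_map output has the sg device first and the matching sd device second (eg "/dev/sg6  /dev/sdg"). Check the raw sg_map output on your own system before relying on it.

```shell
# Sketch only: look up the sg device that matches an Unraid sd name.
# ASSUMPTION: each sg_map line looks like "/dev/sg6  /dev/sdg" (sg first, sd second).
# sg_for_sd is a hypothetical helper name; it reads sg_map output on stdin.
sg_for_sd() {
    # Print column 1 of any line whose column 2 is exactly /dev/<name>
    awk -v dev="/dev/$1" '$2 == dev { print $1 }'
}

# Example usage on the server:
#   sg_map | sg_for_sd sdg
```

Matching on the whole second column (rather than a plain text search) avoids sdaa also matching when you ask for sda.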

There is a second way to get the disk references which you may prefer. It uses SeaChest, so needs carrying out after Part 2 (below). @TDD explains it in this post here...

Part 2: Get SeaChest onto Unraid
NOTE: I copied SeaChest onto my Flash drive, and then into the tmp folder. There's probably a better way of doing this

EDIT: Since writing this the zip file to download has changed its structure. I've updated the instructions to match the new download.
5. Open your flash drive from Windows (eg \\tower\flash), create a folder called "seachest" and enter it
6. Go to https://www.seagate.com/gb/en/support/software/seachest/ and download "SeaChest Utilities"
7. Open the downloaded zip file and navigate to Linux\Lin64\ubuntu-20.04_x86_64\ (when this guide was written, it was just "Linux\Lin64". The naming of the ubuntu folder may change in future downloads)
8. Copy all files from there to the seachest folder on your flash drive

Now we need to move the seachest folder to /tmp. I used mc, but many will just copy over with a command. The rest of this part takes place in the Terminal window opened in step 2...

9. Open Midnight Commander by typing "mc"
10. Using arrows and enter, click the ".." entry on the left side
11. Using arrows and enter, click the "/boot" folder
12. Tab to switch to the right panel, use arrows and enter to click the ".."
13. Using arrows and enter, click the "/tmp" folder
14. Tab back to the left panel and press F6 and enter to move the seachest folder into tmp
15. F10 to exit Midnight Commander

Finally, we need to change to the seachest folder on /tmp and make these utilities executable...
16. Enter the following commands...

cd /tmp/seachest

...to change to your new seachest folder, and...

chmod +x SeaChest_*

...to make the files executable.
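
For those who would rather skip Midnight Commander, steps 9 to 16 can probably be collapsed into a few commands. A rough sketch, assuming the folder names used in this guide (/boot/seachest and /tmp/seachest); stage_seachest is just an illustrative name, not something that ships with Unraid or SeaChest. Note that it copies rather than moves, so the files stay on the flash drive.

```shell
# Sketch only: stage SeaChest from the flash drive into /tmp and make it executable.
# stage_seachest is a hypothetical helper; the default paths are the ones used in this guide.
stage_seachest() {
    src="${1:-/boot/seachest}"
    dest="${2:-/tmp/seachest}"
    mkdir -p "$dest" || return 1
    # Copy (not move) so the originals remain on the flash drive
    cp -r "$src"/. "$dest"/ || return 1
    # Same effect as the chmod in step 16
    chmod +x "$dest"/SeaChest_*
}

# Example usage on the server:
#   stage_seachest && cd /tmp/seachest
```

Because /tmp is cleared on reboot, the same function could be run again whenever the utilities are needed.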

Part 3: Making the changes to your Seagate drive(s)

EDIT: When this guide was written, there was what looked like a version number at the end of each file, represented by XXXX below. Now each file has "_x86_64-linux-gnu" so where it mentions XXXX you need to replace with that.

This is all done in the Terminal window. The commands here have two things that may be different on your setup - the version of SeaChest downloaded (XXXX) and the drive you're working on (YY). This is where Notepad comes in handy - plan out all required commands first

17. Get the info about a drive...

./SeaChest_Info_XXXX -d /dev/sgYY -i

...in my case (as an example) "./SeaChest_Info_150_11923_64 -d /dev/sg6 -i"

You should notice that EPC has "enabled" next to it and Low Current Spinup is enabled

18. Disable EPC...

./SeaChest_PowerControl_XXXX -d /dev/sgYY --EPCfeature disable

...for example "./SeaChest_PowerControl_1100_11923_64 -d /dev/sg6 --EPCfeature disable"

19. Repeat step 17 to confirm EPC is now disabled
20. Repeat steps 17-19 for any other disks you need to set

21. Disable Low Current Spinup...

./SeaChest_Configure_XXXX -d /dev/sgYY --lowCurrentSpinup disable

...for example "./SeaChest_Configure_1170_11923_64 -d /dev/sg6 --lowCurrentSpinup disable"
It is not possible to check this without rebooting, but if you do not get any errors it's likely to be fine.
22. Repeat step 21 for any other disks
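
With several drives, steps 17 to 22 could be wrapped in a loop instead of typing each command. This is a sketch under assumptions, not a tested tool: run and disable_epc_and_lcs are made-up helper names, SUFFIX has to be set to whatever follows the underscore in your SeaChest file names, and it expects to be run from /tmp/seachest. It defaults to a dry run that only prints the commands (much like preparing them in Notepad), so nothing touches a disk until DRY_RUN is set to 0.

```shell
# Sketch only: apply the commands from steps 17-21 to several drives.
# ASSUMPTIONS: run from /tmp/seachest; SUFFIX matches your downloaded file names.
SUFFIX="${SUFFIX:-XXXX}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = actually run them

# run is a hypothetical wrapper: print the command in dry-run mode, else execute it
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# disable_epc_and_lcs is a hypothetical helper; pass sg names (eg sg6 sg26) as arguments
disable_epc_and_lcs() {
    for sg in "$@"; do
        run ./SeaChest_Info_"$SUFFIX" -d "/dev/$sg" -i
        run ./SeaChest_PowerControl_"$SUFFIX" -d "/dev/$sg" --EPCfeature disable
        run ./SeaChest_Configure_"$SUFFIX" -d "/dev/$sg" --lowCurrentSpinup disable
        # Repeat the info command so the EPC state can be checked afterwards
        run ./SeaChest_Info_"$SUFFIX" -d "/dev/$sg" -i
    done
}

# Example usage on the server (dry run first, then for real):
#   SUFFIX=x86_64-linux-gnu; disable_epc_and_lcs sg6 sg26
#   DRY_RUN=0; disable_epc_and_lcs sg6 sg26
```

Keeping the dry run as the default is deliberate: a wrong sg number would change settings on the wrong disk, so it is worth reading the printed commands before running them.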

You should now be good to go. Once this was done (took about 15 minutes) I rebooted and then upgraded from 6.8.3 to 6.9.1. It's been fine since, whereas before I would get a drive dropping off every few days. Make sure you have a full backup of 6.8.3, and don't make too many system changes for a while in case you need to roll back.

SeaChest will be removed when you reboot the system (as it's in /tmp). If you want to retain it on your boot drive, copy it to /tmp instead of moving it. You will still need to copy it off /boot each time you want to run it, as you need to make it executable.

Completely fine if you want to hold off for an official fix. I'm not so sure it will be a software fix though, since it affects these specific drives only. It may be a firmware update for the drive, which may just make similar changes to above.

As an afterthought, looking through these Seagate utilities, it might be possible to write a user script to completely automate this. Another alternative is to boot from a Linux USB and run it outside of Unraid (though it would be more difficult to identify drives).

***********************************************

TL;DR - Just the Steps

I've had to do this several times myself and wanted somewhere to just get all the commands I'll need...

Get all /dev/sgYY numbers from list (compared to dashboard disk assignments)...

sg_map

Download seachest from https://www.seagate.com/gb/en/support/software/seachest/

Extract and copy seachest folder to /tmp

Change to seachest and make files executable...

cd /tmp/seachest
chmod +x SeaChest_*

For each drive you need to change (XXXX is suffix in seachest files, YY is number obtained from above)...

./SeaChest_Info_XXXX -d /dev/sgYY -i
./SeaChest_PowerControl_XXXX -d /dev/sgYY --EPCfeature disable
./SeaChest_Configure_XXXX -d /dev/sgYY --lowCurrentSpinup disable

Repeat the first info command at the end to confirm EPC is disabled. Cold boot to make sure it's all sorted.
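
The info command prints a lot of text, so for the confirmation step it may help to filter it down. A small sketch: show_power_settings is a made-up name, and it assumes the info output labels the two settings with the strings "EPC" and "Low Current Spinup" as described in step 17. The exact wording may differ between SeaChest versions, so fall back to reading the full output if nothing is shown.

```shell
# Sketch only: keep just the lines of the info output that mention the two settings.
# ASSUMPTION: the output contains the strings "EPC" and "Low Current Spinup".
# show_power_settings is a hypothetical helper; it reads the info output on stdin.
show_power_settings() {
    grep -i -E 'EPC|Low Current Spinup'
}

# Example usage on the server:
#   ./SeaChest_Info_XXXX -d /dev/sgYY -i | show_power_settings
```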

Edited October 9, 2023 by Cessquill
Tweaked title to be more specific, tweaked text to reflect no issues for two months, tweaked to clarify entering command; added "./" to start of command as it's now required; added TLDR summary section at the end

JorgeB · March 11, 2021

Nice work, if you don't object I was thinking of moving this to the general guides section or it will likely drop from the 1st page here.

Cessquill · March 11, 2021

Just now, JorgeB said:

Nice work, if you don't object I was thinking of moving this to the general guides section or it will likely drop from the 1st page here.

Of course, yes. Partly seeking confirmation I hadn't done something seriously wrong - little bit out of my depth!

TDD · March 11, 2021

Thank you for the work bringing this together. There is an easy way to just target the disks you want to modify.

SeaChest_PowerControl_1100_11923_64 -s --onlySeagate

I believe most tools actually allow this -s switch. See screenshot. This allows you to skip the 'map' part and make this easier :-)!

Kev.

1b.png.4cd890cbe41ec1ee48f4e2e5ae394bd7.png

Cessquill · March 11, 2021

41 minutes ago, TDD said:

Thank you for the work bringing this together. There is an easy way to just target the disks you want to modify.

SeaChest_PowerControl_1100_11923_64 -s --onlySeagate

I believe most tools actually allow this -s switch. See screenshot. This allows you to skip the 'map' part and make this easier :-)!

Kev.

Thanks for that - I did see onlySeagate when trawling through the text doc manuals; forgot to go back to it (before I'd got SC working).

RockDawg · March 13, 2021

I am stuck on one thing. When I unzip the SeaChestUtilities.zip file and go to /Linux/Lin64/, there are 3 folders in there, no files. The folders are centos-7_aarch64, centos-7_x86_64 and ubuntu-20.04_x86_64. Which do I want?

Cessquill · March 13, 2021

13 minutes ago, RockDawg said:

I am stuck on one thing. When I unzip the SeaChestUtilities.zip file and go to /Linux/Lin64/, there are 3 folders in there, no files. The folders are centos-7_aarch64, centos-7_x86_64 and ubuntu-20.04_x86_64. Which do I want?

That's changed since I did it last week. I'm just starting to test, but @TDD or @JorgeB may be more help here

RockDawg · March 13, 2021

I tried the ubuntu files just for the info command and it worked. So I am going to try continuing with those.

Cessquill · March 13, 2021

Just now, RockDawg said:

I tried the ubuntu files just for the info command and it worked. So I am going to try continuing with those.

I'd have thought centos, but not being a Linux guy I'm not sure (or how much difference it makes). I'll update the post when it's clear.

RockDawg · March 13, 2021

I'm not a Linux guy either. At all. I just figured if it wasn't the right one it would throw an error. It seemed to work but the drive is still disabled after a reboot. I assume I still have to unassign, reassign and rebuild the drive? And these changes will merely keep it from going off line again?

Edited March 13, 2021 by RockDawg

Cessquill · March 13, 2021

Just now, RockDawg said:

I assume I still have to unassign, reassign and rebuild the drive? And these changes will merely keep it from going off line again?

Yes

RockDawg · March 13, 2021

Anyone know why the drive has to be rebuilt? Did the data get corrupted? So people with more than one (or 2 if they had dual parity) that went out at the same time lost data?

That's pretty scary that something like that could happen just by upgrading Unraid versions!

Edited March 13, 2021 by RockDawg

TDD · March 14, 2021

12 hours ago, Cessquill said:

That's changed since I did it last week. I'm just starting to test, but @TDD or @JorgeB may be more help here

Linux guy here. Use the Ubuntu ones. If they don't work for unknown reasons, I have an archive of the older tool set.

Kev.

JorgeB · March 14, 2021

13 hours ago, RockDawg said:

So people with more than one (or 2 if they had dual parity) that went out at the same time lost data?

Unraid only disables as many disks as there are parity devices.

Neejrow · March 14, 2021

I just want to say thank you very much for this guide. Made it very simple to fix. Hopefully this resolves my issue that I just ran into today!

Cessquill · March 14, 2021

Updated original post to reflect new structure of SeaChest Utilities zip file

Vitor Ventura · March 24, 2021

Hello there,

Server just random crash? Without any info in log?

Any pattern, simply crash?

I have that, with 1 LSI controller and 1 ST8000VN004. It doesn't happen on the Unraid RC version, only on stable versions...

JorgeB · March 25, 2021

11 hours ago, Vitor Ventura said:

Server just random crash?

Likely unrelated to this issue.

Danny N · March 29, 2021

Hi, so this is probably not related to this issue, but last week I made major changes to my system in order to add an SSD cache and a GPU for a VM. For this I added 2 LSI 9207 cards via an Asus Hyper M.2 expander (due to only having 2 PCIe slots with 4 lanes or more, one of which was already in use). Anyway, it's a little complex but seems to work(ish), but I'm having problems with the LSI cards dropping out and then crashing. This is using 6x ST16000NM001G with 4 data drives and dual parity, and this issue has so far caused 4 disks to become disabled.

Data 2 and 3 were disabled on the first parity check after the changes, at around 5% completion. The second attempt worked and data 2 and 3 rebuilt successfully. At this point I made a full backup and ran another parity check to confirm it was running OK, and it crashed at approx 10% with parity 1 disabled. I then swapped the controllers around so that controller 2 had the HDDs connected and controller 1 had the SSDs, and ran a parity rebuild for the parity 1 disk. At around 4% it crashed and data 3 was disabled again. Unfortunately I'd only enabled logging on the second crash and didn't realise the logs were only saved in RAM, so I have no logs.

I'm aware that the ST16000NM001G is not an Ironwolf, but I read that these are very similar to their 16TB Ironwolf drives, so they may be affected. I originally thought this was due to bent pins on the CPU, which happened during this rebuild when I dropped the CPU after it stuck to the underside of the cooler and crushed it with the case while trying to catch it. This affected 8 pins, completely flattening them, but according to the diagram on WikiChip these are for memory channel A and GND (pin 2 from the corner broke, but this is only power). The CPU ran happily during a stress test and is currently 2 hours through a mem test with 0 errors, so if this isn't the issue then I can only assume it to be the signal integrity between the CPU and the 9207s, which I'll test by dropping the link speed down to gen 2 and hoping this doesn't affect my 10Gb NIC.

full system spec before
DATA: ST16000NM001G x6
cache- none
vm data - samsung 860 1tb via unassigned drives
docker data - sandisk 3d ultra 960gb via unassigned drives
these were connected via mobo ports and via a cheap sata card I had lying around in pciex1_1
GPU: 1660super for plex (in pciex16_1)
CPU: 3950X

mobo: asus b550-m
ram: 64gb Corsair Vengeance (non ecc) @3600mhz
psu: Corsair 850W RMX

case: Fractal Design Node 804
with APC UPS 700VA

damaged pin details:
according to WikiChip (link to pic)

Damaged pins were C39 - K39 (C39 - K38 fully flattened), and AP1 to AU1 were slightly bent. After repair, B39 fell off as it was not only flattened but had actually folded in half, and A39, C39, E39 and J39 still had a thin section on the top part of the pin right where it was bent. The system booted and passed a CPU stress test etc. (didn't consider doing a mem test at this time).

full system spec after
DATA: ST16000NM001G x6
cache- 2x MX500 2tb
vm data - 2x samsung 860 1tb via pools
docker data - sandisk 3d ultra 960gb and samsung 860 1tb via pools

these are via 2x LSI 9207 in slot pciex16_1 via Hyper M.2 slots 2 and 3 (with the HDDs on one card and the SSDs on the other card)

NIC: asus XG-C100C (in pciex16_1 via Hyper M.2 slot 4)
GPU: 1660super for plex (in pciex16_1 via Hyper M.2 slot 1)
GPU2: RX570 (intended for win 10 vm, currently unused, in pciex16_2)
CPU: 3950X (now with bent and missing pins)
ram: 64gb Corsair Vengeance (non ecc) @3600mhz

mobo: asus b550-m
psu: Corsair 850W RMX
with APC UPS 700VA
case: Fractal Design Node 804 (yeah it's a very tight build)

I'll update if I find the issue (or get logs of it now I have those set up), but there's a slim chance it's related (still got at least 22 hours of mem test to go though).

Sorry for the long comment, but more detail hopefully helps.

Danny N · March 29, 2021

Edit to the above: I do have this log from after it finished the first rebuild and crashed, disabling the parity disk, though it probably doesn't help much.

syslog

Edited March 29, 2021 by Danny N
attached file was in the middle of the text

trurl · March 29, 2021

Go to Tools - Diagnostics and attach the complete Diagnostics ZIP file to your NEXT post in this thread

Danny N · March 29, 2021

Ty, currently doing a memtest for a day so I'll do this tomorrow once I get back into Unraid.

3 minutes ago, trurl said:

Go to Tools - Diagnostics and attach the complete Diagnostics ZIP file to your NEXT post in this thread

Danny N · March 30, 2021

23 hours ago, trurl said:

Go to Tools - Diagnostics and attach the complete Diagnostics ZIP file to your NEXT post in this thread

This is only about a minute after bootup, hopefully it helps. For now I'm gonna try dropping the PCIe link speed down to gen 2 to see if it's the ribbon cables (the shielded ones) for the HBA cards.

dnas-diagnostics-20210330-1804.zip

EDIT: Adding syslog

dnas-syslog-20210330-1711.zip

Edited March 30, 2021 by Danny N
adding syslog

Danny N · April 1, 2021

On 3/30/2021 at 6:06 PM, Danny N said:

This is only about a minute after bootup, hopefully it helps. For now I'm gonna try dropping the PCIe link speed down to gen 2 to see if it's the ribbon cables (the shielded ones) for the HBA cards.

dnas-diagnostics-20210330-1804.zip

EDIT: Adding syslog

dnas-syslog-20210330-1711.zip

OK, it seems to be the PCI Express link speed - after dropping to gen 2 I've now had a successful parity rebuild on 2 drives and then a full parity check without error. This is the first time it's done a parity check successfully, and it never finished 2 operations back to back before either, so I'm gonna say the issue in this thread has nothing to do with mine.
EDIT: thanks for the help

Edited April 1, 2021 by Danny N
see edit tag

optiman · April 8, 2021

Thank you all for this thread, very helpful. I'm still on 6.8.3 and I have several Seagate ST8000NM0055 (standard 512E) firmware SN04, which are listed as Enterprise Capacity. I just checked and Seagate has a firmware update for this model, SN05. I also have several Seagate ST12000NE0008 Ironwolf Pro drives with firmware EN01, no firmware updates available. My controller is an LSI 9305-24i x8, bios P14 and firmware P16_IT. I've had zero issues, uptime 329 days.

I was thinking of using the Seagate-provided USB Linux bootable flash builder, booting to that and running the commands outside of Unraid. Given I only have Seagate drives, I will need to do them all. Has anyone tried this with success?
