6.9.x, LSI Controllers & Ironwolf Disks Disabling - Summary & Fix


Cessquill


NOTE: There's a TL;DR section at the end of this post with required steps

People with specific Seagate Ironwolf disks on LSI controllers have been having issues with Unraid 6.9.0 and 6.9.1.  Typically, when spinning up, a drive could drop off the system.  Getting it back online would require checking, unassigning, reassigning and rebuilding its contents (about 24 hours).  It happened to me three times in a week, across two of my four affected drives.

 

The drive in question is the 8TB Ironwolf ST8000VN004, although the 10TB model has also been mentioned, so it may affect several models.

 

There have been various comments and suggestions across the threads, and it appears that there is a workaround.  The workaround is reversible, so if an official fix comes along you can revert your settings.  This thread is here to consolidate the great advice given by @TDD, @SimonF, @JorgeB and others, to hopefully make it easier for people to follow.

 

This thread is also here to hopefully provide a central place for those with the same hardware combo to track developments.

 

NOTE: Carry out these steps at your own risk.  Whilst I will list each step I did, and it's all possible within Unraid, it's your data.  Read through, and only carry anything out if you feel comfortable.  I'm far from an expert - I'm just consolidating valuable information that was scattered across several threads - so if this is doing more harm than good, or is repeated elsewhere, then close this off.

 

The solution involves making changes to the settings of the Ironwolf disk.  This is done by running some Seagate command-line utilities (SeaChest), as explained by @TDD here.

The changes we will be making are:

  • Disable EPC (Extended Power Conditions - a power-management feature)
  • Disable Low Current Spinup (not confirmed whether this is required)

 

The Seagate utilities refer to disks slightly differently from Unraid, but there is a way to translate one to the other, explained by @SimonF here.

 

I have carried out these steps and it looks to have solved the issue for me.  I've therefore listed them below in case it helps anybody.  It is nowhere near as long-winded as it looks - I've just listed literally every step.

 

Note that I am not really a Linux person, so getting the Seagate utilities onto Unraid might look like a right kludge.  If there's a better way, let me know.  All work is carried out on a Windows machine.  I use Notepad to prepare commands beforehand, so I can construct each command first, then copy and paste it into the terminal.

 

If you have the option, make these changes before upgrading Unraid...

 

Part 1: Identify the disk(s) you need to work on

EDIT: See the end of this part for an alternate method of identifying the disks
1. Go down your drives list on the Unraid main tab.  Note down the part in brackets next to any relevant disk (eg, sdg, sdaa, sdac, sdad)
2. Open up a Terminal window from the header bar in Unraid
3. Type the following command and press enter.  This will give you a list of all drives with their sg and sd references.

sg_map

4. Note down the sg reference of each drive you identified in step 1 (eg, sdg=sg6, sdaa=sg26, etc.)
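
The output is a simple two-column mapping of sg device to sd device, and will look something like this (your device names will differ):

/dev/sg0  /dev/sda
/dev/sg6  /dev/sdg
/dev/sg26  /dev/sdaa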

 


 

There is a second way to get the disk references, which you may prefer.  It uses SeaChest, so needs carrying out after Part 2 (below).  @TDD explains it in this post here...

 

Part 2: Get SeaChest onto Unraid
NOTE: I copied SeaChest onto my flash drive, and then into the tmp folder.  There's probably a better way of doing this.

EDIT: Since writing this the zip file to download has changed its structure, I've updated the instructions to match the new download.
5. Open your flash drive from Windows (eg \\tower\flash), create a folder called "seachest" and enter it
6. Go to https://www.seagate.com/gb/en/support/software/seachest/ and download "SeaChest Utilities"
7. Open the downloaded zip file and navigate to Linux\Lin64\ubuntu-20.04_x86_64\ (when this guide was written, it was just "Linux\Lin64".  The naming of the ubuntu folder may change in future downloads) 
8. Copy all files from there to the seachest folder on your flash drive

Now we need to move the seachest folder to /tmp.  I used mc, but many will just copy it over with a command (there's a one-liner after step 15 if you prefer that).  The rest of this part takes place in the Terminal window opened in step 2...

9. Open Midnight Commander by typing "mc"
10. Using arrows and enter, click the ".." entry on the left side
11. Using arrows and enter, click the "/boot" folder
12. Tab to switch to the right panel, use arrows and enter to click the ".."
13. Using arrows and enter, click the "/tmp" folder
14. Tab back to the left panel and press F6 and enter to move the seachest folder into tmp
15. F10 to exit Midnight Commander
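
If you'd rather skip Midnight Commander, steps 9-15 can be replaced with a single copy command, which also leaves the original on the flash drive for next time:

cp -r /boot/seachest /tmp/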

 

Finally, we need to change to the seachest folder on /tmp and make these utilities executable...
16. Enter the following commands...

cd /tmp/seachest

...to change to your new seachest folder, and...

chmod +x SeaChest_*

...to make the files executable.

 

Part 3: Making the changes to your Seagate drive(s)

EDIT: When this guide was written, there was what looked like a version number at the end of each file name, represented by XXXX below.  Now each file ends with "_x86_64-linux-gnu", so where it mentions XXXX you need to replace it with that.

 

This is all done in the Terminal window.  The commands here have two things that may be different on your setup - the version of SeaChest downloaded (XXXX) and the drive you're working on (YY).  This is where Notepad comes in handy - plan out all the required commands first.

 

17. Get the info about a drive...

./SeaChest_Info_XXXX -d /dev/sgYY -i

...in my case (as an example) "./SeaChest_Info_150_11923_64 -d /dev/sg6 -i"

You should notice that EPC has "enabled" next to it, and that Low Current Spinup is also enabled.
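
The relevant part of the info output will look something like this (the exact wording may differ between SeaChest versions, so treat this as a rough guide):

Features Supported:
    EPC [Enabled]
    Low Current Spinup [Enabled]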

 

18. Disable EPC...

./SeaChest_PowerControl_XXXX -d /dev/sgYY --EPCfeature disable

...for example "./SeaChest_PowerControl_1100_11923_64 -d /dev/sg6 --EPCfeature disable"


19. Repeat step 17 to confirm EPC is now disabled
20. Repeat steps 17-19 for any other disks you need to set

 

21. Disable Low Current Spinup...

./SeaChest_Configure_XXXX -d /dev/sgYY --lowCurrentSpinup disable

...for example "./SeaChest_Configure_1170_11923_64 -d /dev/sg6 --lowCurrentSpinup disable"
It is not possible to check this without rebooting, but if you do not get any errors it's likely to be fine.
22. Repeat step 21 for any other disks

 

You should now be good to go.  Once this was done (it took about 15 minutes) I rebooted and then upgraded from 6.8.3 to 6.9.1.  It's been fine since, whereas before I would get a drive dropping off every few days.  Make sure you have a full backup of 6.8.3, and don't make too many system changes for a while in case you need to roll back.

 

SeaChest will be removed when you reboot the system (as it's in /tmp).  If you want to retain it on your boot drive, copy it to /tmp instead of moving it.  You will need to copy it off /boot each time you want to run it, as files on the flash drive can't be made executable.
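
If you want it available straight after every boot without doing this manually, one option (an assumption on my part - I haven't tested it) is to add the copy and chmod to /boot/config/go, which Unraid runs at startup:

# copy SeaChest off the flash drive and make it executable at boot
cp -r /boot/seachest /tmp/
chmod +x /tmp/seachest/SeaChest_*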

 

It's completely fine if you want to hold off for an official fix.  I'm not so sure it will be a software fix though, since it only affects these specific drives.  It may be a firmware update for the drive, which may just make similar changes to the above.

 

As an afterthought, looking through these Seagate utilities, it might be possible to write a user script to completely automate this.  Another alternative is to boot into a Linux USB stick and run it outside of Unraid (though it would be more difficult to identify the drives).
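
If anyone wants to experiment, below is a rough sketch of what such a script might look like.  It assumes the binaries are in /tmp/seachest with the "_x86_64-linux-gnu" suffix, and that SeaChest_Info prints the model number in its output - treat it as a starting point, not a tested script:

#!/bin/bash
# Sketch only: disable EPC and Low Current Spinup on every ST8000VN004 found.
# Adjust the path, file suffix and model number to match your setup.
SC=/tmp/seachest
for dev in /dev/sg*; do
  if "$SC/SeaChest_Info_x86_64-linux-gnu" -d "$dev" -i 2>/dev/null | grep -q "ST8000VN004"; then
    echo "Updating $dev"
    "$SC/SeaChest_PowerControl_x86_64-linux-gnu" -d "$dev" --EPCfeature disable
    "$SC/SeaChest_Configure_x86_64-linux-gnu" -d "$dev" --lowCurrentSpinup disable
  fi
done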

 

 

***********************************************

TL;DR - Just the Steps

I've had to do this several times myself and wanted somewhere to just get all the commands I'll need...

 

Get all /dev/sgYY numbers from the list (compare against the dashboard disk assignments)...

sg_map

 

Download seachest from https://www.seagate.com/gb/en/support/software/seachest/

Extract and copy seachest folder to /tmp

Change to seachest and make files executable...

cd /tmp/seachest
chmod +x SeaChest_*

 

For each drive you need to change (XXXX is the suffix of the seachest files, YY is the number obtained above)...

./SeaChest_Info_XXXX -d /dev/sgYY -i
./SeaChest_PowerControl_XXXX -d /dev/sgYY --EPCfeature disable
./SeaChest_Configure_XXXX -d /dev/sgYY --lowCurrentSpinup disable
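
As a concrete example, with the current "_x86_64-linux-gnu" file suffix and a drive at /dev/sg6, those three commands would be:

./SeaChest_Info_x86_64-linux-gnu -d /dev/sg6 -i
./SeaChest_PowerControl_x86_64-linux-gnu -d /dev/sg6 --EPCfeature disable
./SeaChest_Configure_x86_64-linux-gnu -d /dev/sg6 --lowCurrentSpinup disable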

Repeat the first info command at the end to confirm EPC is disabled.  Cold boot to make sure all is sorted.

Edited by Cessquill
Tweaked title to be more specific, tweaked text to reflect no issues for two months, tweaked to clarify entering command; added "./" to start of command as it's now required; added TLDR summary section at the end

Thank you for the work bringing this together.  There is an easy way to just target the disks you want to modify.

 

SeaChest_PowerControl_1100_11923_64 -s --onlySeagate

 

I believe most tools actually allow this -s switch.  See screenshot.  This allows you to skip the 'map' part and make this easier :-)!

 

Kev.

 

[Screenshot: output of the -s scan]

41 minutes ago, TDD said:

Thank you for the work bringing this together.  There is an easy way to just target the disks you want to modify.

 

SeaChest_PowerControl_1100_11923_64 -s --onlySeagate

 

I believe most tools actually allow this -s switch.  See screenshot.  This allows you to skip the 'map' part and make this easier :-)!

 

Kev.

 

[Screenshot: output of the -s scan]

Thanks for that - I did see onlySeagate when trawling through the text doc manuals; forgot to go back to it (before I'd got SC working).

13 minutes ago, RockDawg said:

I am stuck on one thing.  When I unzip the SeaChestUtilities.zip file and go to /Linux/Lin64/, there are 3 folders in there, no files.  The folders are centos-7_aarch64, centos-7_x86_64 and ubuntu-20.04_x86_64.  Which do I want?

That's changed since I did it last week.  I'm just starting to test, but @TDD or @JorgeB may be more help here.


I'm not a Linux guy either.  At all.  I just figured if it wasn't the right one it would throw an error.  It seemed to work, but the drive is still disabled after a reboot.  I assume I still have to unassign, reassign and rebuild the drive?  And these changes will merely keep it from going offline again?

Edited by RockDawg

Anyone know why the drive has to be rebuilt?  Did the data get corrupted?  So people with more than one drive (or two, if they had dual parity) that went out at the same time lost data?

 

That's pretty scary that something like that could happen just by upgrading Unraid versions!

Edited by RockDawg
  • 2 weeks later...

Hi - probably not related to this issue, but last week I made major changes to my system in order to add an SSD cache and a GPU for a VM.  For this I added 2 LSI 9207 cards via an ASUS Hyper M.2 expander (due to only having 2 PCIe slots with 4 lanes or more, one of which was already in use).  Anyway, it's a little complex but seems to work(ish), except I'm having problems with the LSI cards either dropping out or crashing.  This is with 6x ST16000NM001G - 4 data drives and dual parity - and the issue has so far caused 4 disks to become disabled: data 2 and 3 on the first parity check after the changes, at around 5% completion.  The second attempt worked and data 2 and 3 rebuilt successfully.  At that point I made a full backup and ran another parity check to confirm it was running OK, and it crashed at approx 10% with parity 1 disabled.  I then swapped the controllers around so that controller 2 had the HDDs connected and controller 1 had the SSDs, ran a parity rebuild for the parity 1 disk, and at around 4% it crashed again with data 3 disabled.  Unfortunately I'd only enabled logging on the second crash, and didn't realise the logs were only saved in RAM, so I have no logs.

I'm aware that the ST16000NM001G is not an Ironwolf, but I've read that these are very similar to Seagate's 16TB Ironwolf drives, so they may be affected.  I originally thought this was due to bent pins on the CPU, which happened during this rebuild when I dropped the CPU (it had stuck to the underside of the cooler) and crushed it against the case while trying to catch it.  This completely flattened 8 pins, but according to the diagram on WikiChip these are for memory channel A and GND (pin 2 from the corner broke, but this is only power).  The CPU ran happily during a stress test and is currently 2 hours into a memtest with 0 errors.  So if that isn't the issue, I can only assume it's the signal integrity between the CPU and the 9207s, which I'll test by dropping the link speed down to Gen 2 and hoping that doesn't affect my 10Gb NIC.

Full system spec before:
DATA: ST16000NM001G x6
cache: none
VM data: Samsung 860 1TB via Unassigned Devices
Docker data: SanDisk 3D Ultra 960GB via Unassigned Devices
(these were connected via mobo ports and a cheap SATA card I had lying around in pciex1_1)
GPU: 1660 Super for Plex (in pciex16_1)
CPU: 3950X

mobo: ASUS B550-M
RAM: 64GB Corsair Vengeance (non-ECC) @ 3600MHz
PSU: Corsair 850W RMx

case: Fractal Design Node 804
with APC UPS 700VA

Damaged pin details:
according to WikiChip (see 1600px-OPGA-1331_pinmap.svg.png)

The damaged pins were C39 - K39 (C39 - K38 fully flattened), and AP1 to AU1 were slightly bent.  After the repair, B39 fell off, as it was not only flattened but had actually folded in half :( and A39, C39, E39 and J39 still had a thin section at the top of the pin, right where it was bent.  The system booted and passed a CPU stress test etc. (didn't consider doing a memtest at this time).


Full system spec after:
DATA: ST16000NM001G x6
cache: 2x MX500 2TB
VM data: 2x Samsung 860 1TB via pools
Docker data: SanDisk 3D Ultra 960GB and Samsung 860 1TB via pools

(these are connected via 2x LSI 9207 cards in pciex16_1 via Hyper M.2 slots 2 and 3, with the HDDs on one card and the SSDs on the other)

NIC: ASUS XG-C100C (in pciex16_1 via Hyper M.2 slot 4)
GPU: 1660 Super for Plex (in pciex16_1 via Hyper M.2 slot 1)
GPU2: RX 570 (intended for a Win 10 VM, currently unused, in pciex16_2)
CPU: 3950X (now with bent and missing pins)
RAM: 64GB Corsair Vengeance (non-ECC) @ 3600MHz

mobo: ASUS B550-M
PSU: Corsair 850W RMx
with APC UPS 700VA
case: Fractal Design Node 804 (yeah, it's a very tight build)

 

I'll update if I find the issue (or get logs of it, now I have those set up), but there's a slim chance it's related (still got at least 22 hours of memtest to go, though).

Sorry for the long comment, but more detail hopefully helps.

23 hours ago, trurl said:

Go to Tools  - Diagnostics and attach the complete Diagnostics ZIP file to your NEXT post in this thread 

This is only about a minute after bootup - hopefully it helps.  For now I'm going to try dropping the PCIe link speed down to Gen 2 to see if it's the ribbon cables (the shielded ones) for the HBA cards.

dnas-diagnostics-20210330-1804.zip

EDIT: Adding syslog

dnas-syslog-20210330-1711.zip

Edited by Danny N
adding syslog
On 3/30/2021 at 6:06 PM, Danny N said:

This is only about a minute after bootup - hopefully it helps.  For now I'm going to try dropping the PCIe link speed down to Gen 2 to see if it's the ribbon cables (the shielded ones) for the HBA cards.

dnas-diagnostics-20210330-1804.zip 127.87 kB · 0 downloads

EDIT: Adding syslog

dnas-syslog-20210330-1711.zip 25.79 kB · 0 downloads

OK, it seems to be the PCIe link speed.  After dropping to Gen 2 I've now had a successful parity rebuild on 2 drives, and then a full parity check without error.  This is the first time it's completed a parity check successfully, and it had never finished 2 operations back to back before either, so I'm going to say this has nothing to do with the issue in this thread.
EDIT: thanks for the help :)

Edited by Danny N
see edit tag

Thank you all for this thread, very helpful.  I'm still on 6.8.3 and I have several Seagate ST8000NM0055 (standard 512e) drives with firmware SN04, which are listed as Enterprise Capacity.  I just checked and Seagate has a firmware update for this model, SN05.  I also have several Seagate ST12000NE0008 Ironwolf Pro drives with firmware EN01, with no firmware updates available.  My controller is an LSI 9305-24i x8, BIOS P14 and firmware P16_IT.  I've had zero issues, uptime 329 days.

 

I was thinking of using the Seagate-provided bootable Linux USB flash builder, booting to that, and running the commands outside of Unraid.  Given I only have Seagate drives, I will need to do them all.  Has anyone tried this with success?

