Cessquill (Author) Posted April 9, 2021

15 hours ago, optiman said:
"I was thinking of using the Seagate-provided USB Linux bootable flash builder, booting to that, and running the commands outside of Unraid. Given I only have Seagate drives, I will need to do them all. Has anyone tried this with success?"

I haven't tried the bootable Seagate utility, but I assume it just loads to a command prompt with the tools preinstalled. For me it was easier to go via Unraid (plus no downtime). In other news, I haven't had a single issue since applying the above (before, I had three issues in about a week).
optiman Posted April 9, 2021

I'm trying to decide whether to update the firmware on the 8TB drives or leave them on SN04. I haven't had any issues, and Seagate says not to upgrade unless you are having problems. Advice please: upgrade those 8TB drives to SN05 first, or leave them, make the changes using TDD's instructions, and then upgrade Unraid?
MisterWolfe Posted April 9, 2021

I'm on 6.9.1 and have an LSI 9200-8i controller. No issues at all with my IronWolf drives, thankfully. I wonder if the issue is card-version specific.
Cessquill (Author) Posted April 9, 2021

1 minute ago, MisterWolfe said:
"I'm on 6.9.1 and have an LSI 9200-8i controller. No issues at all with my IronWolf drives, thankfully. I wonder if the issue is card-version specific."

I only had issues with one model of IronWolf drive (mentioned in the first post). All the others were fine. Trouble is, out of 16 IronWolfs, 4 were that model.
TDD Posted April 9, 2021

I only know of the exact 8TB unit in question that requires this tweak. I presented Seagate with all the info on the issue but got meaningless responses back. I was hoping to chat with the hardware/firmware guys. We can only hope the intel makes it to where it needs to be.

For the record, my testing was done on both of my LSI controllers, and the same outcome was found prior to the fix:

[1000:0064] 01:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2116 PCI-Express Fusion-MPT SAS-2 [Meteor] (rev 02)
[1000:0072] 02:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)

Kev.
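For anyone wanting to check which LSI chip their own HBA uses, a listing like the one above can be produced from the Unraid console with lspci (a standard Linux tool; the grep pattern here is just one way to filter the output):

# List SAS/SCSI controllers with vendor:device IDs (-nn)
lspci -nn | grep -iE 'sas|scsi'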
tdunks Posted April 13, 2021

I have three ST8000AS002 drives that have started having this issue too, but this fix did not work. All three have the exact same number of read errors, only during parity checks and always near the end. I am not sure if there is another issue at work as well.
YB96 Posted April 15, 2021

Thanks for the work, but I need your advice on whether I am affected, since I have slightly different symptoms. I have one ST8000VN004 drive and an LSI SAS2008 controller with the newest IT-mode firmware (at least it was the newest in August). My problem is that I have read errors on that 8TB IronWolf, only read errors. The drive does not get disabled, but the counter keeps increasing. The read errors started about a week ago. I had been on the 6.9 beta since about November with no problems, except for one time when the exact same disk had a lot of errors, which was probably due to a plug not fully inserted after adding a new drive. That time I had many UDMA CRC errors in SMART, which have not increased since I reseated the cable, so that is probably not related(?). Shall I go ahead and apply this fix, or do you think it's unrelated? Logs attached.

unraid-nick-diagnostics-20210416-0104.zip
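To check whether the UDMA CRC counter is still climbing (attribute 199 on most SATA drives), smartctl can be run from the console; the device name below is just this poster's example:

# Show SMART attributes and filter for the CRC error counter
smartctl -A /dev/sdf | grep -i 'crc'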
JorgeB Posted April 16, 2021

7 hours ago, YB96 said:
"My problem is that I have read errors on that 8TB IronWolf, only read errors."

Errors appear to always occur during spin-up; try the fix, or see if disabling spin down for that disk helps.
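One way to check that correlation yourself is to search the syslog for spin-up events alongside the controller's abort messages (the path is per stock Unraid; adjust if you log elsewhere):

# Show spin-up events and task aborts together, with timestamps
grep -E 'spinning up|task abort' /var/log/syslog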
optiman Posted April 16, 2021

OK, now we have several Seagate models affected; something is just not right here. Does anyone know what has actually changed to cause this? Can the Unraid team fix this in a future release, or does this mean that everyone running Seagate drives is at risk? Even if the fix works today, how do you know it will be OK in the next release? It seems there is a deeper issue here that must be addressed. With all of this, I'm staying on 6.8.3 for now and will continue to enjoy my trouble-free server. I don't have any spare drives to test with, and data loss is not an option for me. A fix that does not involve messing with drive firmware or options would be much appreciated.
JorgeB Posted April 16, 2021

1 hour ago, optiman said:
"Can the Unraid team fix this in a future release, or does this mean that everyone running Seagate drives is at risk?"

No. If there's a problem, it's between the LSI driver and some Seagate drives; either LSI or Seagate would need to fix it.
Cessquill (Author) Posted April 16, 2021

1 hour ago, optiman said:
"Even if the fix works today, how do you know it will be OK in the next release?"

Because it was a fault with either the drive or the controller, and the fix was a change to the drive's settings. I understand that other systems have also had problems with this drive/controller combo. Any future upgrade could theoretically surface a previously unfound issue with anything. If the manufacturers don't step up, then I'd reconsider whether to use their hardware for server work in future, just as I wouldn't set up a pfSense box using Realtek NICs. If it helps, I've had zero issues since reining in the drive's settings.
YB96 Posted April 17, 2021 (edited)

18 hours ago, JorgeB said:
"Errors appear to always occur during spin-up; try the fix, or see if disabling spin down for that disk helps."

Thanks. Today my drive got kicked from the array and I have to rebuild it. While following your guide I noticed something: EPC was already disabled, which might indicate that disabling the low current spinup is indeed what's required. In case it helps spot something, this is my drive information:

/dev/sg4 - ST8000VN004-2M2101 - *hidden* - ATA
Model Number: ST8000VN004-2M2101
Serial Number: *hidden*
Firmware Revision: SC60
World Wide Name: 5000C500CF63B876
Drive Capacity (TB/TiB): 8.00/7.28
Native Drive Capacity (TB/TiB): 8.00/7.28
Temperature Data:
    Current Temperature (C): 34
    Highest Temperature (C): 59
    Lowest Temperature (C): 19
Power On Time: 224 days 17 hours
Power On Hours: 5393.00
MaxLBA: 15628053167
Native MaxLBA: 15628053167
Logical Sector Size (B): 512
Physical Sector Size (B): 4096
Sector Alignment: 0
Rotation Rate (RPM): 7200
Form Factor: 3.5"
Last DST information:
    Time since last DST (hours): 3869.00
    DST Status/Result: 0x0
    DST Test run: 0x1
Long Drive Self Test Time: 12 hours 30 minutes
Interface speed:
    Max Speed (Gb/s): 6.0
    Negotiated Speed (Gb/s): 6.0
Annualized Workload Rate (TB/yr): 308.52
Total Bytes Read (TB): 150.15
Total Bytes Written (TB): 39.79
Encryption Support: Not Supported
Cache Size (MiB): 256.00
Read Look-Ahead: Enabled
Write Cache: Enabled
Low Current Spinup: Ultra Low Enabled
SMART Status: Unknown or Not Supported
ATA Security Information: Supported
Firmware Download Support: Full, Segmented, Deferred
Specifications Supported:
    ACS-4
    ACS-3
    ACS-2
    ATA8-ACS
    ATA/ATAPI-7
    ATA/ATAPI-6
    ATA/ATAPI-5
    SATA 3.3
    SATA 3.2
    SATA 3.1
    SATA 3.0
    SATA 2.6
    SATA 2.5
    SATA II: Extensions
    SATA 1.0a
    ATA8-AST
Features Supported:
    Sanitize
    SATA NCQ
    SATA Rebuild Assist
    SATA Software Settings Preservation [Enabled]
    SATA Device Initiated Power Management
    Power Management
    Security
    SMART [Enabled]
    48bit Address
    PUIS
    GPL
    Streaming
    SMART Self-Test
    SMART Error Logging
    Write-Read-Verify
    DSN
    AMAC
    EPC
    Sense Data Reporting
    SCT Write Same
    SCT Error Recovery Control
    SCT Feature Control
    SCT Data Tables
    Host Logging
    Set Sector Configuration
    Seagate In Drive Diagnostics (IDD)
Adapter Information:
    Vendor ID: 1000h
    Product ID: 0072h
    Revision: 0003h

Edited April 17, 2021 by YB96 (hide the S/N)
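For anyone wanting to pull the same report for their own drive, a listing like the one above is what SeaChest_Info prints; the flags below are an assumption based on the utility's help output, and /dev/sg4 is just this poster's device handle:

# Print full device information for one drive (run from the SeaChest folder)
./SeaChest_Info -d /dev/sg4 -i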
TDD Posted April 17, 2021

There could very well be edge cases with other IronWolf drives, but it is assuredly an issue with the ST8000VN004. I would not bet on a timely firmware update for the drive itself, if one ever comes. The two changes make the drive more aggressive with its spin-up and readiness, to compensate for the driver timing out while waiting for the drive's ready state. You have nothing to lose by making these changes, as they are reversible; the power saving given up is negligible IMHO, and the benefits of an upgraded Unraid are worth it. Try and see!

Kev.
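As a reminder, the "two changes" referenced throughout the thread boil down to two SeaChest commands per affected drive. This is a sketch: the flag spellings are assumptions based on the SeaChest help output at the time, and /dev/sgX stands in for your drive's handle (confirm it with SeaChest_Info before writing anything):

# Disable the Extended Power Conditions (EPC) feature set
./SeaChest_PowerControl -d /dev/sgX --EPCfeature disable

# Disable low current spinup so the drive spins up at full current
./SeaChest_Configure -d /dev/sgX --lowCurrentSpinup disable

# Both changes should be reversible with the same tools, e.g.:
#   ./SeaChest_PowerControl -d /dev/sgX --EPCfeature enable
#   ./SeaChest_Configure -d /dev/sgX --lowCurrentSpinup low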
jamikest Posted April 17, 2021

I just wanted to give a quick THANK YOU for this post. I was receiving multiple errors whenever my 8TB IronWolf drive would spin up. I went through cables, relocating the drive on the controller, and finally trying a new 8TB IronWolf. The issue persisted through all of these measures. Digging a bit deeper, I found this post. I tried the SeaChest commands and the spin-up errors are resolved.

For anyone else searching the forum, here is the syslog output any time a drive would spin up (sometimes with read errors in Unraid, sometimes without, as in the example below):

Apr 17 11:03:37 Tower emhttpd: spinning up /dev/sdc
Apr 17 11:03:53 Tower kernel: sd 7:0:1:0: attempting task abort!scmd(0x000000009175e648), outstanding for 15282 ms & timeout 15000 ms
Apr 17 11:03:53 Tower kernel: sd 7:0:1:0: [sdc] tag#1097 CDB: opcode=0x85 85 06 20 00 00 00 00 00 00 00 00 00 00 40 e3 00
Apr 17 11:03:53 Tower kernel: scsi target7:0:1: handle(0x0009), sas_address(0x4433221101000000), phy(1)
Apr 17 11:03:53 Tower kernel: scsi target7:0:1: enclosure logical id(0x5c81f660d1f49300), slot(2)
Apr 17 11:03:56 Tower kernel: sd 7:0:1:0: task abort: SUCCESS scmd(0x000000009175e648)
Apr 17 11:03:56 Tower emhttpd: read SMART /dev/sdc

After disabling low current spinup and EPC, here is the result of spinning up the same drive (no errors!):

Apr 17 12:08:42 Tower emhttpd: spinning up /dev/sdc
Apr 17 12:08:51 Tower emhttpd: read SMART /dev/sdc
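If you want to force a spin-up cycle to verify the fix without waiting for Unraid's spin-down timer, generic Linux tools can do it. This is an illustration only: hdparm and dd are standard, but the device name is an example, and it is best done while the array is otherwise idle:

# Put the drive into standby immediately
hdparm -y /dev/sdc

# Read a little raw data to force it to spin back up, then check the syslog
dd if=/dev/sdc of=/dev/null bs=1M count=8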
jamikest Posted April 19, 2021

On 4/16/2021 at 11:48 PM, TDD said:
"There could very well be edge cases with other IronWolf drives, but it is assuredly an issue with the ST8000VN004. ..."

Just to add to this a bit further: I am running both 4TB ST4000VN008 and 8TB ST8000VN004 IronWolf drives. I have not had a single error on my five 4TB drives since I started my build about 5 months ago. As soon as I added an 8TB IronWolf to my array, the errors started. One more thing I find interesting: I swapped an 8TB IronWolf into my parity slot about 6 weeks ago and have had no errors on that drive. I am not sure why the parity drive behaves differently. I disabled EPC and low power spinup on all the 8TB drives (parity and array) and left the 4TB drives as is.
TDD Posted April 19, 2021

My 8TB IronWolf was the sole Seagate, and it was the parity drive that errored out. It all comes down strictly to how idle the drive is, and to the spin-ups after that.

Kev.
Mason736 Posted April 28, 2021

On 4/19/2021 at 10:30 AM, TDD said:
"My 8TB IronWolf was the sole Seagate, and it was the parity drive that errored out. ..."

I can confirm this. Ever since I changed the ST8000VN004 drives (four of them) to never spin down and always be spun up, they have been fine and have not dropped from the array.
TDD Posted April 28, 2021

I want to add that my solo 8TB drive, my parity, does spin down and up as needed and is not always spinning. This fix does not affect any requests to go idle.

Kev.
edrohler Posted May 10, 2021

WOW! Thank you for this thread. I have been scratching my head about this all week (I posted here). Instead of tweaking the drives, I am just going to disable the spin-down delay for any drives in the enclosure and hope for an update to the driver.
TDD Posted May 10, 2021

The full tweak allows spin-downs gracefully, so you give up nothing. It might even save a watt or two :-).

Kev.
Cessquill (Author) Posted May 10, 2021

8 hours ago, edrohler said:
"WOW! Thank you for this thread. I have been scratching my head about this all week. ..."

Just saw your post in the unbalance thread and was about to suggest you check here. No need now!
lgil Posted May 13, 2021

Hi, I have followed the steps to disable EPC and Low Current Spinup; all commands finished without errors. This morning one of my two ST8000VN004 disks (disk 1 - sdf) showed a read disk error. I ran a SMART test with no errors, rebooted Unraid, and now disk 1 shows no errors. I'd really appreciate any help.
kolepard Posted May 13, 2021

I have three of these drives in my array, and since applying the fix I have had no problems; the array has been on 24/7 (currently over 44 days) with multiple spin-ups and spin-downs of all the drives. I am nowhere near the level of expertise of others here, but my guess is that yours is a separate, unrelated issue.

Kevin
lgil Posted May 15, 2021

On 5/13/2021 at 7:22 PM, lgil said:
"Hi, I have followed the steps to disable EPC and Low Current Spinup; all commands finished without errors. ..."

An update: I reverted to the previous configuration and rebooted my Unraid server. Before starting the array, I tried the process one more time, and it is working now:

May 15 05:11:49 honeysnas emhttpd: spinning down /dev/sde
May 15 05:11:49 honeysnas emhttpd: spinning down /dev/sdf
May 15 06:21:17 honeysnas emhttpd: read SMART /dev/sdf
May 15 06:51:21 honeysnas emhttpd: spinning down /dev/sdf
May 15 09:45:33 honeysnas emhttpd: read SMART /dev/sde
May 15 09:45:33 honeysnas emhttpd: read SMART /dev/sdf
May 15 10:17:25 honeysnas emhttpd: spinning down /dev/sde
May 15 10:17:25 honeysnas emhttpd: spinning down /dev/sdf

No read errors since yesterday afternoon. Thank you @Cessquill, great work.
TangoEchoAlpha Posted June 5, 2021 (edited)

I am very glad that I stumbled across this thread - thanks @Cessquill! I've got an LSI card ordered from Art Of Server that I am waiting to have delivered; I've gone for the LSI 9201-8i, which I understand is the same as the 9211-8i but without the IR-mode NVRAM chip. I was just about to order some more drives and was about to hit the buy button on the Seagates. Does anyone know if this issue affects the ST8000NE001? It is the same drive as the ST8000VN004, except the former is the IronWolf Pro and the latter the standard IronWolf. I was going to buy the Pro drive, not for the 'Pro' moniker but for the extra two years of warranty; given that the retailer does a 48-hour replacement service for the lifetime of the warranty, those extra two years could be of benefit, and the price difference was only £10 GBP.

Running Unraid 6.9.2. Thanks in advance 😎

Edited June 5, 2021 by TangoEchoAlpha