August 24, 2023

Hi All,

Looking to see if you can help me pin down what is causing my disk write error issues. I have an Unraid array (exact specs are below) that was recently upgraded with an LSI 9207-8i HBA and 4 brand new 16 TB Seagate EXOS X16 drives, alongside 3 preexisting 14 TB WD drives and a 500 GB SATA3 Samsung 840 EVO SSD. This array is primarily used for Plex movies/TV, but I also store ISOs and other common files on the SSD and use one of the 14 TB WDs as a backup drive for Veeam and security cam footage.

Before I added the 4x16 TB HDDs I was running the 3x14 TB HDDs for Plex plus a few smaller drives for some backup data. That setup had 0 issues despite running on one of the cheap SATA expander cards that I know are heavily disliked and prone to issues. When I added the 4x16 TB HDDs to the array I saw some issues using onboard SATA + expander ports, so I upgraded to the 9207-8i and moved ALL of my SATA drives to it. Things worked well, speeds were great, and I had no reliability issues until a few days ago.

4 days ago one of my 16 TB parity drives failed. It logged ~2000 disk errors in a single second, all reading like "Tower kernel: md: disk29 read error, sector=7675180088", and it was dropped from the array. I immediately ran a SMART short test and then an extended test, and both came back with 0 issues. I chalked it up to a random error, and this morning I finally stopped the array, removed the disabled parity drive, started the array, stopped it, re-added the parity drive, started it again, and ran a parity check.

The parity check is still running and that drive has been fine so far, but now a different 16 TB drive (part of my Movies share, with very little data on it so far) has ~2000 errors on it (the same messages in the logs, just for this other disk), has been disabled, and also comes back clean on both short and extended SMART tests.
I'm unsure what is causing this. I have been happy with the performance of my array lately, but I don't know where these random (false?) errors are coming from. Please let me know if I'm missing any information and I will happily get it added! I've attached my current diagnostics file, and the specs of my array are below.

Hardware:
Intel i3-8100 CPU
2x8 GB DDR4 memory
ASRock B365 Pro4 motherboard
16 GB USB 3.0 boot flash drive
Intel 82576 2-port gigabit Ethernet PCI-E NIC
Intel 82599ES 1-port 10Gb SFP+ PCI-E NIC (ETH0/primary connection to network)
LSI 9207-8i PCI-E HBA + 2x 4-port mini-SAS to SATA breakout cables
500 GB Samsung 840 EVO SATA SSD
512 GB Inland Professional NVMe SSD
4x16 TB ST16000NM001G Seagate EXOS X16 SATA HDD
3x14 TB WD140EDGZ Western Digital SATA HDD
Corsair CX500 PSU

Disks/Shares:
Parity - 2x16 TB Seagate (Parity 1/sdg + Parity 2/sdd)
SSD - 500 GB Samsung SATA SSD (Disk 1/sdh)
Movies - 14 TB WD (Disk 3/sdi) + 16 TB Seagate (Disk 2/sde - ERROR DISK)
TV - 14 TB WD (Disk 5/sdc) + 16 TB Seagate (Disk 4/sdf)
Backups - 14 TB WD (Disk 8/sdb)
Pool/Flash Cache - 512 GB Inland NVMe (nvme0n1)

Thanks in advance for the help!

Attachment: tower-diagnostics-20230824-1142.zip

Edited August 24, 2023 by nicholasdaa (added array and shares screenshot)
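For anyone wanting to see at a glance which disks are throwing these errors, here is a minimal Python sketch that tallies md read-error lines per disk. It assumes every error line follows the same shape as the one quoted in the post; the helper name `count_read_errors` and all sample lines other than the quoted one are made up for illustration.

```python
import re
from collections import Counter

# Matches lines shaped like the one quoted in the post:
#   "Tower kernel: md: disk29 read error, sector=7675180088"
MD_READ_ERROR = re.compile(r"md: disk(\d+) read error, sector=(\d+)")


def count_read_errors(lines):
    """Return a Counter mapping md disk number -> number of read-error lines."""
    counts = Counter()
    for line in lines:
        match = MD_READ_ERROR.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


sample = [
    "Tower kernel: md: disk29 read error, sector=7675180088",  # quoted in the post
    "Tower kernel: md: disk29 read error, sector=7675180096",  # made-up sector
    "Tower kernel: md: disk2 read error, sector=1024",         # made-up line
    "Tower kernel: some unrelated message",                    # ignored
]

errors = count_read_errors(sample)
for disk, n in sorted(errors.items()):
    print(f"disk{disk}: {n} read errors")
```

To use it on a real log, pass an open text file instead of `sample`, for example a syslog extracted from the diagnostics zip (the exact file name inside the archive is not given in this thread).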
August 24, 2023 (Community Expert, marked as Solution)

It's not logged as a disk problem with either disk. Start by updating the LSI firmware:

LSISAS2308: FWVersion(20.00.06.00)

All 20.00.xx releases except 20.00.07 have known issues.
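The check described in this reply can be sketched in a few lines of Python. This is only an illustration of the rule as stated here (any 20.00.xx firmware other than 20.00.07 is suspect); the function names are hypothetical, and it assumes the log line always has the `FWVersion(a.b.c.d)` shape shown above.

```python
import re

# Assumed log shape, taken from the line quoted in the reply:
#   "LSISAS2308: FWVersion(20.00.06.00)"
FW_VERSION = re.compile(r"FWVersion\((\d+)\.(\d+)\.(\d+)\.(\d+)\)")


def parse_fw_version(line):
    """Return the firmware version as a tuple of four ints, or None if absent."""
    match = FW_VERSION.search(line)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def has_known_issues(version):
    """Apply the rule from the reply: 20.00.xx is suspect unless xx is 07."""
    major, minor, release, _build = version
    return major == 20 and minor == 0 and release != 7


version = parse_fw_version("LSISAS2308: FWVersion(20.00.06.00)")
if version is not None and has_known_issues(version):
    print("Firmware is a 20.00.xx release other than 20.00.07: update recommended")
```

Versions outside the 20.00.xx family are simply reported as not matching the rule; the reply makes no claim about them, so the sketch does not either.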
August 24, 2023 (Author)

Ah, that's helpful, thanks! What version SHOULD I update to? Is .07 preferred, or is there a better version that suits Unraid/the HBA better? Appreciate the instant reply!