March 5

I've got several drives in my array, six of which are the same model from the same place, and I'm seeing a lot of errors:

Mar 2 11:42:21 Innsmouth kernel: md: disk8 write error, sector=2000
Mar 2 11:42:21 Innsmouth kernel: md: disk8 write error, sector=2008
Mar 2 11:42:21 Innsmouth kernel: md: disk8 write error, sector=2016
Mar 2 11:42:21 Innsmouth kernel: md: disk8 write error, sector=2024
Mar 2 11:42:21 Innsmouth kernel: md: disk8 write error, sector=2032
Mar 2 11:45:29 Innsmouth kernel: Buffer I/O error on dev md8p1, logical block 0, async page read
Mar 2 11:46:02 Innsmouth kernel: Buffer I/O error on dev md8p1, logical block 0, async page read
Mar 2 11:47:02 Innsmouth kernel: Buffer I/O error on dev md8p1, logical block 0, async page read
Mar 2 11:48:02 Innsmouth kernel: Buffer I/O error on dev md8p1, logical block 0, async page read
Mar 2 11:49:02 Innsmouth kernel: Buffer I/O error on dev md8p1, logical block 0, async page read
Mar 2 11:49:57 Innsmouth kernel: Buffer I/O error on dev md8p1, logical block 0, async page read
Mar 2 11:50:00 Innsmouth kernel: Buffer I/O error on dev md8p1, logical block 0, async page read
Mar 2 11:50:11 Innsmouth kernel: critical target error, dev sdh, sector 1856 op 0x1:(WRITE) flags 0x4000 phys_seg 64 prio class 0
Mar 2 11:50:11 Innsmouth kernel: md: disk8 write error, sector=1792
Mar 2 11:50:11 Innsmouth kernel: md: disk8 write error, sector=1800
Mar 2 11:50:11 Innsmouth kernel: md: disk8 write error, sector=1808
Mar 2 11:50:11 Innsmouth kernel: md: disk8 write error, sector=1816
Mar 2 11:50:11 Innsmouth kernel: md: disk8 write error, sector=1824
Mar 2 11:50:11 Innsmouth kernel: md: disk8 write error, sector=1832
Mar 2 11:50:11 Innsmouth kernel: md: disk8 write error, sector=1840

Here is the SMART report for one of the drives:

smartctl 7.5 2025-04-30 r5714 [x86_64-linux-6.12.54-Unraid] (local build)
Copyright (C) 2002-25, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               IBM-XIV
Product:              ST6000NM0054 D5
Revision:             EC6D
Compliance:           SPC-4
User Capacity:        6,001,175,122,432 bytes [6.00 TB]
Logical block size:   512 bytes
Physical block size:  4096 bytes
LU is fully provisioned
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000c500845ddc87
Serial number:        Z4D39C310000R608MHK6
Device type:          disk
Transport protocol:   SAS (SPL-4)
Local Time is:        Tue Mar 3 01:29:25 2026 PST
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled
Read Cache is:        Enabled
Writeback Cache is:   Disabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Grown defects during certification = 0
Total blocks reassigned during format = 0
Total new blocks reassigned = 0
Power on minutes since format = 1690834
Current Drive Temperature:     29 C
Drive Trip Temperature:        65 C

Elements in grown defect list: 0

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   1949493561        0         0  1949493561          0     850636.856           0
write:           0        0         2           2          2     286897.910           0
verify:  178262958        0         0   178262958          0       8444.860           0

Non-medium error count:      523

SMART Self-test log
Num  Test              Status                     segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                                  number   (hours)
# 1  Background long   Completed                     -      64410          - [-   -    -]
# 2  Background short  Completed                     -      64368          - [-   -    -]
# 3  Background long   Completed                     -      64285          - [-   -    -]
# 4  Background short  Completed                     -      64257          - [-   -    -]
# 5  Background short  Completed                     -      64246          - [-   -    -]
# 6  Background short  Completed                     -      36637          - [-   -    -]
# 7  Background short  Completed                     -      36565          - [-   -    -]
# 8  Background short  Completed                     -      36469          - [-   -    -]
# 9  Background short  Completed                     -      36416          - [-   -    -]
#10  Background long   Completed                     -         81          - [-   -    -]
#11  Background short  Completed                     -         67          - [-   -    -]
#12  Background short  Aborted (by user command)     -         45          - [-   -    -]

Long (extended) Self-test duration: 38632 seconds [10.7 hours]

Background scan results log
  Status: waiting until BMS interval timer expires
    Accumulated power on time, hours:minutes 64524:51 [3871491 minutes]
    Number of background scans performed: 329, scan progress: 0.00%
    Number of background medium scans performed: 329
Device does not support General statistics and performance logging

Protocol Specific port log page for SAS SSP
relative target port id = 1
  generation code = 0
  number of phys = 1
  phy identifier = 0
    attached device type: expander device
    attached reason: SMP phy control function
    reason: loss of dword synchronization
    negotiated logical link rate: phy enabled; 6 Gbps
    attached initiator port: ssp=0 stp=0 smp=0
    attached target port: ssp=0 stp=0 smp=1
    SAS address = 0x5000c500845ddc85
    attached SAS address = 0x500056b39f5e9dff
    attached phy identifier = 6
    Invalid DWORD count = 28
    Running disparity error count = 28
    Loss of DWORD synchronization count = 14
    Phy reset problem count = 10
relative target port id = 2
  generation code = 0
  number of phys = 1
  phy identifier = 1
    attached device type: no device attached
    attached reason: unknown
    reason: unknown
    negotiated logical link rate: phy enabled; unknown
    attached initiator port: ssp=0 stp=0 smp=0
    attached target port: ssp=0 stp=0 smp=0
    SAS address = 0x5000c500845ddc86
    attached SAS address = 0x0
    attached phy identifier = 0
    Invalid DWORD count = 0
    Running disparity error count = 0
    Loss of DWORD synchronization count = 0
    Phy reset problem count = 0

I don't believe I see anything out of place in the SMART report.

These are all 6 TB SAS drives; the rest are SATA of varying sizes. I have them installed in an R730XD.

One drive has been kicked from the array twice, and two of them were kicked a couple days ago. There are no errors in the lifecycle logs on the iDRAC and all drives are showing as good.

All of the SAS drives have given errors, but only three have been kicked out of the array (one of them twice now).
Swapping a SATA drive into the slot of one of the failed SAS drives gives no errors, so it does not appear to be specific to a slot or cable, unless there is an issue that a SAS drive would hit and a SATA drive wouldn't.

Can you help shed a little light on what may be happening here? I get the feeling they're going to push back a bit on replacing 6 drives.
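For anyone comparing the six SAS drives side by side, the per-phy link counters at the end of the SMART report above (Invalid DWORD count, Running disparity error count, Loss of DWORD synchronization count, Phy reset problem count) can be pulled out of saved smartctl text. This is only a rough sketch in Python: the function name phy_counters is made up for illustration, it assumes the report text has been saved to a string, and it relies on nothing beyond the label wording shown in the report above. Nonzero counts here are worth comparing across drives, but a small nonzero count on its own does not prove a bad cable or backplane.

```python
import re

# Counter labels exactly as they appear in the report above.
PHY_LABELS = (
    "Invalid DWORD count",
    "Running disparity error count",
    "Loss of DWORD synchronization count",
    "Phy reset problem count",
)


def phy_counters(report: str) -> list[dict[str, int]]:
    """Return one dict of link counters per phy block found in the report text.

    phy_counters is a hypothetical helper, not part of smartmontools.
    """
    phys = []
    # Each phy block in the report starts with "relative target port id = N".
    blocks = re.split(r"relative target port id = \d+", report)[1:]
    for block in blocks:
        counters = {}
        for label in PHY_LABELS:
            match = re.search(re.escape(label) + r"\s*=\s*(\d+)", block)
            if match:
                counters[label] = int(match.group(1))
        phys.append(counters)
    return phys


# Excerpt copied from the report above (only the lines the parser needs).
SAMPLE = """
relative target port id = 1
    reason: loss of dword synchronization
    Invalid DWORD count = 28
    Running disparity error count = 28
    Loss of DWORD synchronization count = 14
    Phy reset problem count = 10
relative target port id = 2
    Invalid DWORD count = 0
    Running disparity error count = 0
    Loss of DWORD synchronization count = 0
    Phy reset problem count = 0
"""

for index, counters in enumerate(phy_counters(SAMPLE), start=1):
    print(f"port {index}: {counters}")
```

Running the same function over the saved report from each SAS drive would show whether the counts cluster on particular drives or slots.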
March 5 Community Expert

Errors on multiple disks at the same time are typically a power/connection issue:

Mar 5 00:36:09 Innsmouth kernel: md: disk3 read error, sector=11669581064
Mar 5 00:36:09 Innsmouth kernel: md: disk1 read error, sector=11669580824
Mar 5 00:36:09 Innsmouth kernel: md: disk4 read error, sector=11669580824

Do they share anything in common, like a power splitter or miniSAS cable?
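The pattern described above, several disks erroring in the same second, can be checked across a longer syslog by grouping md error lines by timestamp. A minimal sketch in Python, assuming the log lines have the same shape as the ones quoted in this thread; disks_by_timestamp and the regular expression are illustrative, not an Unraid tool.

```python
import re
from collections import defaultdict

# Matches lines shaped like the ones quoted in this thread, e.g.
# "Mar 5 00:36:09 Innsmouth kernel: md: disk3 read error, sector=11669581064"
MD_ERROR = re.compile(
    r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel: md: "
    r"(disk\d+) (read|write) error, sector=(\d+)"
)


def disks_by_timestamp(lines):
    """Map each timestamp to the set of disks that logged an md error then.

    disks_by_timestamp is a hypothetical helper for illustration.
    """
    seen = defaultdict(set)
    for line in lines:
        match = MD_ERROR.match(line.strip())
        if match:
            timestamp, disk, _kind, _sector = match.groups()
            seen[timestamp].add(disk)
    return dict(seen)


# Lines copied from the posts above.
LOG = [
    "Mar 5 00:36:09 Innsmouth kernel: md: disk3 read error, sector=11669581064",
    "Mar 5 00:36:09 Innsmouth kernel: md: disk1 read error, sector=11669580824",
    "Mar 5 00:36:09 Innsmouth kernel: md: disk4 read error, sector=11669580824",
    "Mar 2 11:42:21 Innsmouth kernel: md: disk8 write error, sector=2000",
]

for timestamp, disks in disks_by_timestamp(LOG).items():
    note = "multiple disks" if len(disks) > 1 else "single disk"
    print(f"{timestamp}: {sorted(disks)} ({note})")
```

Timestamps where more than one disk shows up are the ones that suggest something shared, such as power or a cable, rather than a single failing drive.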
March 5 Author

I have 12 slots in the front, and I believe there are two power and two SAS cables from the backplane to the mainboard. Initially, all the SAS drives were in slots 0-5, which would have shared a connection, but I moved one to slot 6 and still had issues. If it were an issue with a cable, shouldn't the SATA drives throw errors too?