November 26, 2025
Hello, I do not currently have physical access to my array, but I noticed a few issues. I received these emails; I think this is the correct order.

Event: Unraid array errors
Subject: Warning [X] - array has errors
Description: Array has 1 disk with read errors
Importance: warning
Disk 1 - WDC_WD100EZAZ (sdi) (errors 64616)

I have parity and 5 data disks. I think it did a parity sync, but the parity device got disabled, so I could only do a read check, which succeeded. I then tried to add parity back by stopping the array, removing the parity drive, starting in maintenance mode, and adding it back, but now it looks like this.

I believe it may be a cabling issue, since I touched the cables earlier and the UDMA and reallocated counts went up. But at the same time, do the read errors mean it failed to read my data drive for the parity?

Since I killed the parity drive by reassigning it and I no longer have dual parity, I am not sure what I can do now to investigate further. It is currently still rebuilding.

This is disk 1, the one with read errors:

smartctl -a /dev/sdi
smartctl 7.5 2025-04-30 r5714 [x86_64-linux-6.12.54-Unraid] (local build)
Copyright (C) 2002-25, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Ultrastar (He10/12)
Device Model:     WDC WD100EZAZ-11TDBA0
Serial Number:    X
LU WWN Device Id: 5 000cca 267dc3d79
Firmware Version: 83.H0A83
User Capacity:    10,000,831,348,736 bytes [10.0 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database 7.5/5988
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed Nov 26 08:41:17 2025 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x80) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (  93) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 996) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0004   131   131   054    Old_age   Offline      -       104
  3 Spin_Up_Time            0x0007   253   253   024    Pre-fail  Always       -       124 (Average 357)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       204
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000a   100   100   067    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0004   128   128   020    Old_age   Offline      -       18
  9 Power_On_Hours          0x0012   092   092   000    Old_age   Always       -       57099
 10 Spin_Retry_Count        0x0012   100   100   060    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       200
 22 Helium_Level            0x0023   100   100   025    Pre-fail  Always       -       100
192 Power-Off_Retract_Count 0x0032   067   067   000    Old_age   Always       -       40367
193 Load_Cycle_Count        0x0012   067   067   000    Old_age   Always       -       40367
194 Temperature_Celsius     0x0002   147   147   000    Old_age   Always       -       44 (Min/Max 9/56)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     54947         -
# 2  Short offline       Completed without error       00%     54931         -
# 3  Short offline       Completed without error       00%     54828         -
# 4  Short offline       Completed without error       00%     53194         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

The above only provides legacy SMART information - try 'smartctl -x' for more

Edited November 26, 2025 by Grohmand
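For anyone reading a dump like this, a minimal Python sketch can pull out the raw values that are usually checked first when trying to tell a failing disk from a cabling problem. The choice of attributes (5, 197, 198, 199) and the names `WATCHED` and `watched_raw_values` are assumptions made for illustration, not an official Unraid or smartmontools list. The sample rows are copied from the output above, where all four raw values are 0.

```python
import re

# Attributes often watched when separating media problems from link problems.
# This selection is an assumption for illustration, not an official list.
WATCHED = {
    5: "Reallocated_Sector_Ct",
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
    199: "UDMA_CRC_Error_Count",
}

# One row of the "smartctl -a" attribute table:
# ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+\d+\s+\d+\s+\d+\s+\S+\s+\S+\s+\S+\s+(\d+)"
)


def watched_raw_values(smart_text):
    """Return {attribute_id: raw_value} for the watched attributes found."""
    found = {}
    for line in smart_text.splitlines():
        match = ROW.match(line)
        if match and int(match.group(1)) in WATCHED:
            found[int(match.group(1))] = int(match.group(3))
    return found


# Rows copied from the smartctl output in the post above.
sample = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0012   092   092   000    Old_age   Always       -       57099
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0
"""

values = watched_raw_values(sample)
for attr_id, raw in sorted(values.items()):
    print(f"{attr_id:>3} {WATCHED[attr_id]:<24} raw={raw}")
```

Non-zero raw values on 5, 197, or 198 usually point at the disk surface, while a rising 199 usually points at the SATA link; a clean table like this one does not by itself rule out a power or connection problem.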
November 26, 2025 Author
yue-diagnostics-20251126-1505.zip
Here you go! I did restart the server once, though, so I am not sure whether it still contains the original logs. I believe, or at least hope, that I have not done anything that writes from parity to the data drive. Is it possible that parity could have overwritten the data drive, or does it only work the other way around?
Edited November 26, 2025 by Grohmand
November 26, 2025 Community Expert
It's not logged as a disk problem; check/replace the cables for disk1 and try again.
November 26, 2025 Author
Will do so when I can. Are there any issues with rebuilding parity now, or will it try to recover from the read errors and still calculate parity? And is there any point at which I could have overwritten the data disk with my actions?
November 26, 2025 Community Expert
I would abort the rebuild and try again from the beginning, or you will need to run a correcting check after.
November 26, 2025 Author
There is no rebuild. The parity drive failed, though; was that caused by data drive 1? The parity drive was disabled for me, and that's the one I had to recreate, so it is rebuilding parity now.
November 26, 2025 Community Expert
52 minutes ago, Grohmand said: There is no rebuild
I meant the parity sync that was going on:
Nov 26 05:08:26 Yue kernel: md: recovery thread: recon P ...
52 minutes ago, Grohmand said: the parity drive failed though, is that cause of the data drive 1?
Those would be two separate issues; the diags posted show only issues with disk1. Is parity missing or just invalid? Also, post new diags after a reboot and array start.
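To find lines like the one quoted above in a long syslog, a small filter is enough. This is a minimal sketch: the function name `md_lines` and the second sample line are hypothetical, and the first sample line is copied from the post with its elided tail left as-is.

```python
def md_lines(syslog_lines):
    """Keep only kernel lines emitted by the md driver."""
    return [line for line in syslog_lines if "kernel: md:" in line]


sample = [
    # Copied from the post above, including its elided tail.
    "Nov 26 05:08:26 Yue kernel: md: recovery thread: recon P ...",
    # Hypothetical filler line, only here to show that non-md lines are dropped.
    "Nov 26 05:08:27 Yue example: unrelated message",
]

for line in md_lines(sample):
    print(line)
```

With real diagnostics, the same function could be fed the lines of the syslog file from the extracted archive; where that file sits inside the zip is not shown in this thread, so any path used is an assumption.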
November 27, 2025 Author
11 hours ago, JorgeB said: I meant the parity sync that was going on: Nov 26 05:08:26 Yue kernel: md: recovery thread: recon P ... Those would be two separate issues, the diags posted show only issues with disk1, is parity missing or just invalid? Also, post new diags after a reboot and array start.
The initial issue, as I understand it, was that the parity drive became disabled. Then I noticed there were read errors, but I did a read check after parity and got the same errors.
The parity sync is now done, but there were read errors. Can I consider this parity okay, or is there still a possibility that the issue corrupted the other drives?
November 27, 2025 Author
Parity was disabled, though I am not sure whether the read errors from disk 1 caused that. So I unassigned it, started the array in maintenance mode, stopped the array, re-added the drive, and let parity rebuild; that is now done. We currently have read errors on drive 1.
I can physically access the machine next week, so I will also try to run a memory test to rule out RAM errors.
Edited November 27, 2025 by Grohmand
November 27, 2025 Community Expert
4 hours ago, Grohmand said: The parity is now done but there was read errors, so can I consider this parity okay?
Nope.
4 hours ago, Grohmand said: And we have read errors on drive 1 currently.
Did you replace its cables and try again? If yes, post new diags after the new errors.
December 1, 2025 Author
On 11/27/2025 at 8:48 AM, JorgeB said: Nope. Did you replace its cables and try again? If yes, post new diags after the new errors.
yue-diagnostics-20251201-0858.zip
I restarted and reseated all cables, and I now notice that disk 1 is disabled and its content is emulated. But we know disk 1 had read errors, so should I now recreate the array and rebuild parity, or what should the next step be?
December 1, 2025 Community Expert
If parity was rebuilt with errors on disk1, it won't be valid. I recommend replacing the cables, then doing a new config to re-enable disk1, and trying to build parity again.
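Why parity built during read errors cannot be trusted can be shown with a toy model. This is a sketch under stated assumptions: single parity is modeled as a plain XOR across the data disks, the byte values are made up, and a failed read on disk 1 is assumed to feed wrong bytes (zeros here) into the sync; how Unraid actually handles a failed read during a sync is not shown in this thread.

```python
from functools import reduce


def xor_parity(blocks):
    """XOR equal-length byte blocks column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Five made-up data blocks standing in for disks 1-5 (hypothetical values).
disks = [
    bytes([0x11, 0x22]),
    bytes([0x33, 0x44]),
    bytes([0x55, 0x66]),
    bytes([0x77, 0x88]),
    bytes([0x99, 0xAA]),
]

# Building parity only reads the data blocks; nothing is written back to them.
good_parity = xor_parity(disks)

# Assumption: the failed read on disk 1 contributes zeros instead of real data.
misread = [bytes(2)] + disks[1:]
bad_parity = xor_parity(misread)

# Rebuilding disk 2 means XOR-ing parity with every other data disk.
others = [disks[0]] + disks[2:]
rebuilt_from_good = xor_parity([good_parity] + others)
rebuilt_from_bad = xor_parity([bad_parity] + others)

print(rebuilt_from_good == disks[1])
print(rebuilt_from_bad == disks[1])
```

In this model the data blocks are never modified while parity is computed, but a later rebuild of any disk from the bad parity produces wrong contents, which is why the parity needs to be built again once disk1 reads cleanly.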
December 1, 2025 Author
Thank you, I will do so, and if it doesn't work I will also source new cables. I currently have an LSI SAS card flashed as passthrough with 2x4 cables, so it would take some time to arrange replacements for that. I will update once this is done.
December 2, 2025 Author
yue-diagnostics-20251202-0539.zip
The rebuild is done with no read errors, so I assume it is okay now?
January 15 Community Expert
It's not logged as a disk problem, and SMART looks fine, so most likely it's a power/connection issue. Recommend checking/replacing cables and, assuming the emulated disk is still mounting and contents look correct, rebuilding on top.