December 6, 2022

Turned my server on today and straight away disk 4 was disabled after 1872 read errors. The SMART report won't run; it says "A mandatory smart command failed. exiting. To continue, add one or more '-T permissive' options."

Attached diagnostics. Could it be a dodgy cable, or does the disk need replacing? Would I be OK to try preclearing the disk and re-adding it as disk4 if it passes? It's not showing up in Preclear at the moment, but I assume I'll need to stop the array and remove it from the slot 4 assignment first.

Thanks for any help.

tower-diagnostics-20221206-1622.zip
December 6, 2022 Community Expert

Two of the onboard SATA ports are set to IDE. Go into the BIOS and change them to SATA/AHCI; it's a known problem with these AMD chipsets.
December 7, 2022 Author

9 hours ago, JorgeB said: Two of the onboard SATA ports are set to IDE, go into the BIOS and change them to SATA/AHCI, it's a known problem with these AMD chipsets.

Thanks JorgeB. All 6 mobo SATA ports were showing as IDE in the BIOS. Pretty sure they were set to AHCI previously. I've corrected that and checked the cables are firmly in. Started preclearing it but got loads of errors. Attached diagnostics again. Disk is now showing in Historical Devices list.

Example logs:

Dec 7 02:00:25 Tower kernel: blk_update_request: I/O error, dev sdr, sector 12085568 op 0x0:(READ) flags 0x80700 phys_seg 24 prio class 0
Dec 7 02:00:25 Tower kernel: sd 10:0:0:0: [sdr] tag#4 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=DRIVER_OK cmd_age=92s
Dec 7 02:00:25 Tower kernel: sd 10:0:0:0: [sdr] tag#4 Sense Key : 0x2 [current]
Dec 7 02:00:25 Tower kernel: sd 10:0:0:0: [sdr] tag#4 ASC=0x4 ASCQ=0x21
Dec 7 02:00:25 Tower kernel: sd 10:0:0:0: [sdr] tag#4 CDB: opcode=0x88 88 00 00 00 00 00 00 b8 6a 00 00 00 05 40 00 00
Dec 7 02:00:25 Tower kernel: blk_update_request: I/O error, dev sdr, sector 12087808 op 0x0:(READ) flags 0x80000 phys_seg 64 prio class 0
Dec 7 02:00:25 Tower kernel: blk_update_request: I/O error, dev sdr, sector 12084224 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Dec 7 02:00:25 Tower kernel: Buffer I/O error on dev sdr, logical block 1510528, async page read
Dec 7 02:00:33 Tower preclear_disk_Z840TQGD[16639]: Pre-Read: dd output: dd: error reading '/dev/sdr': Input/output error
Dec 7 02:00:33 Tower preclear_disk_Z840TQGD[16639]: Pre-Read: dd output: 2950+1 records in
Dec 7 02:00:33 Tower preclear_disk_Z840TQGD[16639]: Pre-Read: dd output: 2950+1 records out
Dec 7 02:00:34 Tower preclear_disk_Z840TQGD[16639]: Pre-Read: dd output: 6187122688 bytes (6.2 GB, 5.8 GiB) copied, 181.216 s, 34.1 MB/s
Dec 7 02:00:35 Tower preclear_disk_Z840TQGD[16639]: Pre-read: pre-read verification failed!
Dec 7 02:00:39 Tower preclear_disk_Z840TQGD[16639]: /usr/local/emhttp/plugins/preclear.disk/script/preclear_disk.sh: line 481: /tmp/.preclear/sdr/dd_output_complete: No such file or directory
Dec 7 02:00:40 Tower preclear_disk_Z840TQGD[16639]: /usr/local/emhttp/plugins/preclear.disk/script/preclear_disk.sh: line 483: /tmp/.preclear/sdr/dd_output: No such file or directory

tower-diagnostics-20221207-0205 (after starting preclear).zip

Edited December 7, 2022 by mrbens
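For anyone reading the logs above, here is a quick cross-check of the numbers, as a rough sketch only. It assumes the kernel's "sector" values are in 512-byte units and that the "logical block" in the Buffer I/O error line is a 4096-byte page; neither is stated in the logs themselves. Under those assumptions, the failed sector and the failed logical block should both land on the byte offset where dd stopped. The function names are made up for illustration.

```python
# Assumptions (not stated in the logs): kernel sectors are 512 bytes,
# and the buffer "logical block" is a 4096-byte page.
SECTOR_BYTES = 512
PAGE_BYTES = 4096

# Values copied from the log lines in the post above.
dd_bytes_copied = 6187122688    # "6187122688 bytes ... copied"
failed_sector = 12084224        # "I/O error, dev sdr, sector 12084224"
failed_logical_block = 1510528  # "Buffer I/O error ... logical block 1510528"


def byte_offset_of_sector(sector: int) -> int:
    """Byte offset on the device at which a 512-byte sector starts."""
    return sector * SECTOR_BYTES


def byte_offset_of_block(block: int) -> int:
    """Byte offset on the device at which a 4096-byte block starts."""
    return block * PAGE_BYTES


if __name__ == "__main__":
    print(byte_offset_of_sector(failed_sector) == dd_bytes_copied)
    print(byte_offset_of_block(failed_logical_block) == dd_bytes_copied)
```

If both comparisons hold, the three log lines describe the same spot: dd stopped at the first byte it could not read. That alone does not say whether the cause is the disk surface or the link to the disk.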
December 7, 2022 Author

The disk just popped back into the Unassigned Devices & Preclear list and is no longer in the Historical Devices list. Preclear showed the note "Error encountered, please verify the log". Earlier, I'd tried to run a SMART test after the preclear had begun. Not sure if that's what caused it to fail.

After the preclear there were lots of repeated logs like:

Dec 7 02:12:20 Tower kernel: ata8: SATA link down (SStatus 0 SControl 300)
Dec 7 02:12:14 Tower kernel: ata8: limiting SATA link speed to 1.5 Gbps

Then finally:

Dec 7 02:12:54 Tower kernel: ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 310)

I've now run a SMART short self-test, which says:

Last SMART test result: Completed without error
SMART self-test history: No self-tests have been logged. [To run self-tests, use: smartctl -t]

I'm trying preclear again but getting errors again:

Dec 7 02:22:40 Tower kernel: print_req_error: 864 callbacks suppressed
Dec 7 02:22:40 Tower kernel: blk_update_request: I/O error, dev sdr, sector 66372608 op 0x0:(READ) flags 0x84700 phys_seg 168 prio class 0

Does it look like the disk will need replacing please?

Edited December 7, 2022 by mrbens
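When the syslog is long, it can help to tally these messages rather than read them one by one. Below is a minimal sketch, not a tool from Unraid or the preclear plugin. The regular expressions are written only against the message shapes quoted in this post, so other kernel versions may word them differently, and `summarize` is a hypothetical helper name. Repeated link down or speed-limiting lines on one ata port often point at the connection (cable, port, power) rather than the disk surface, but that is a rule of thumb, not a diagnosis.

```python
import re
from collections import Counter

# Patterns based only on the log lines quoted in this thread.
IO_ERROR = re.compile(r"I/O error, dev (\w+), sector (\d+)")
LINK_EVENT = re.compile(
    r"(ata\d+): (SATA link down"
    r"|SATA link up [\d.]+ Gbps"
    r"|limiting SATA link speed to [\d.]+ Gbps)"
)


def summarize(lines):
    """Count block-layer I/O errors per device and list SATA link events."""
    io_errors = Counter()
    link_events = []
    for line in lines:
        match = IO_ERROR.search(line)
        if match:
            io_errors[match.group(1)] += 1
            continue
        match = LINK_EVENT.search(line)
        if match:
            link_events.append((match.group(1), match.group(2)))
    return io_errors, link_events


# Sample lines copied from the post above.
sample = [
    "Dec 7 02:12:20 Tower kernel: ata8: SATA link down (SStatus 0 SControl 300)",
    "Dec 7 02:12:14 Tower kernel: ata8: limiting SATA link speed to 1.5 Gbps",
    "Dec 7 02:12:54 Tower kernel: ata8: SATA link up 1.5 Gbps (SStatus 113 SControl 310)",
    "Dec 7 02:22:40 Tower kernel: blk_update_request: I/O error, dev sdr, "
    "sector 66372608 op 0x0:(READ) flags 0x84700 phys_seg 168 prio class 0",
]

if __name__ == "__main__":
    errors, events = summarize(sample)
    print(dict(errors))
    for port, event in events:
        print(port, event)
```

To run it over a real log, pass the lines of the syslog file from the diagnostics zip to `summarize` instead of `sample`.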
December 7, 2022 Author

I've replaced the SATA cable with a new one and am running Preclear again. 30 minutes in and no errors so far.

S.M.A.R.T. Status (device type: default)

ATTRIBUTE                 INITIAL  STATUS
Reallocated_Sector_Ct     0        -
Power_On_Hours            32197    -
Runtime_Bad_Block         3        -
End-to-End_Error          0        -
Reported_Uncorrect        0        -
Airflow_Temperature_Cel   25       - ->Failed in Past<-
Current_Pending_Sector    0        -
Offline_Uncorrectable     0        -
UDMA_CRC_Error_Count      37       -

SMART overall-health self-assessment test result: PASSED
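One common way to read a report like this, sketched below as a rule of thumb rather than a verdict: nonzero reallocated, pending, or uncorrectable counts tend to implicate the disk itself, while a nonzero UDMA_CRC_Error_Count tends to implicate the link between disk and controller, which would fit the cable swap. The grouping of attributes into "media" and "link" is this sketch's own assumption, and `triage` is a hypothetical helper. The CRC count is cumulative and does not reset, so what matters from here on is whether it keeps rising.

```python
# Attribute grouping is an assumption of this sketch, not an official list.
MEDIA_ATTRS = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "Reported_Uncorrect",
)
LINK_ATTRS = ("UDMA_CRC_Error_Count",)


def triage(attrs):
    """Return the nonzero attributes, grouped as media-related or link-related."""
    media = {name: attrs[name] for name in MEDIA_ATTRS if attrs.get(name, 0) > 0}
    link = {name: attrs[name] for name in LINK_ATTRS if attrs.get(name, 0) > 0}
    return {"media": media, "link": link}


# Initial values copied from the report in the post above.
report = {
    "Reallocated_Sector_Ct": 0,
    "Power_On_Hours": 32197,
    "Runtime_Bad_Block": 3,
    "End-to-End_Error": 0,
    "Reported_Uncorrect": 0,
    "Airflow_Temperature_Cel": 25,
    "Current_Pending_Sector": 0,
    "Offline_Uncorrectable": 0,
    "UDMA_CRC_Error_Count": 37,
}

if __name__ == "__main__":
    print(triage(report))
```

Runtime_Bad_Block and the temperature note are deliberately left unclassified here, since their meaning varies by vendor.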