May 28, 2020

Actually 3 disks!

First it started with Parity 1 showing a UDMA CRC error count. I changed the SATA cable, connected it to another SATA port on the motherboard, and started rebuilding. While rebuilding, Disk 2 got disabled with 2000+ IO errors. The rebuild was not interrupted, I think because I have 2 parity disks. The rebuild completed and the Parity 1 disk became green again. Disk 2 has a red ball. After a few hours, Disk 3 got disabled.

Disk 2 - ST8000DM004-2CX188_WCT0M8D4 (sdm) (errors 2)
Disk 3 - ST8000DM004-2CX188_WG8011AG (sdl) (errors 2048)

Parity 1 is connected like this: motherboard -> 5.25 to 3.5 converter -> HDD
Disk 2 and Disk 3 are connected to the LSI SAS HBA through the same port.

What is the best way to proceed? I believe losing another disk would cause data loss. I have a spare pre-cleared disk, but it shows 60 under Reported uncorrect; not sure how it got pre-cleared without issues.

Please let me know if you need more info. I know I made a mistake by shutting down the server before getting the logs; I wanted to check the connections and was too scared of losing another disk.

Thanks.

Unraid Version: 6.7.2
CPU: AMD Ryzen 7 2700X
Mobo: MSI X470 GAMING PLUS (MS-7B79)
Memory: 16 GiB DDR4

tower-diagnostics-20200528-1303.zip

Edited May 28, 2020 by HAMANY
May 28, 2020 Community Expert

You rebooted since the errors, so we can't see what happened. For now I would recommend unassigning both disabled disks and starting the array, then checking that both emulated disks mount correctly and that their contents look OK.
May 28, 2020 Author

31 minutes ago, johnnie.black said: You rebooted since the errors, so we can't see what happened. For now I would recommend unassigning both disabled disks and starting the array, then checking that both emulated disks mount correctly and that their contents look OK.

Thanks for the quick response. Will try this now.

Edit: Done, the files look fine. Is there any way to prevent the logs from being removed on reboot, and keep them for a few days?

Edited May 28, 2020 by HAMANY
May 28, 2020 Community Expert

23 minutes ago, HAMANY said: Is there any way to prevent the logs from being removed on reboot, and keep them for a few days?

You can set up a syslog server, though that's used more for troubleshooting when the server keeps crashing or similar. For this kind of problem you just need to download the diagnostics before rebooting.
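As a rough illustration of "grab the logs before rebooting", here is a minimal Python sketch that copies the live syslog to storage that survives a reboot. It rests on several assumptions not stated in the thread: that Python is installed on the server (it may not be by default), that the live log sits at /var/log/syslog, and that /boot/logs is on the flash drive. `save_syslog` is a hypothetical helper name. It only saves the syslog, so it is not a substitute for downloading the full diagnostics zip.

```python
import shutil
import time
from pathlib import Path


def save_syslog(src="/var/log/syslog", dest_dir="/boot/logs"):
    """Copy the current syslog to a timestamped file in dest_dir.

    Both default paths are assumptions about the server layout;
    pass different ones if your system differs.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = dest / f"syslog-{stamp}.txt"
    shutil.copyfile(src, target)  # copy file contents only
    return target
```

Calling `save_syslog()` right before a shutdown would leave a file such as `syslog-<date>-<time>.txt` in the destination folder.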
May 28, 2020 Author

Just now, johnnie.black said: You can set up a syslog server, though that's used more for troubleshooting when the server keeps crashing or similar. For this kind of problem you just need to download the diagnostics before rebooting.

Noted. The server is up and my files look fine. Should I assign the 2 disks again and start rebuilding?
June 1, 2020 Author

On 5/28/2020 at 2:56 PM, johnnie.black said: Yes, and if any more issues post new diags (before rebooting)

Disk 2 got disabled again just after the parity check started. Diags attached. I've stopped the parity check.

tower-diagnostics-20200601-1107.zip
June 1, 2020 Community Expert

That looks more like a power/connection issue, swap/replace cables and try again.
July 6, 2020 Author

On 6/1/2020 at 1:18 PM, johnnie.black said: That looks more like a power/connection issue, swap/replace cables and try again.

Thank you Johnnie! The drives are now more stable after changing one of the power cables (4-pin to SATA).

However, I'm getting read and SMART errors on one of the 8TB Seagate drives (Disk 10 - sdi). I did a full scan using SeaTools but it didn't report any errors! It also passes the SMART extended self-test. Could this be a connection issue as well?

The diagnostics file is attached. Appreciate your support. Thank you.

tower-diagnostics-20200706-0948.zip

Edited July 6, 2020 by HAMANY
July 6, 2020 Community Expert

This does not look too good:

Quote:
5 Reallocated_Sector_Ct PO--CK 099 099 010 - 2352

While reallocated sectors are not necessarily a problem as long as the number stays constant, anything other than a small number is often a good indication that the drive's health may be suspect. You might want to run an extended SMART test on the drive to see how that goes.
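To make the "as long as the number stays constant" check concrete, here is a minimal Python sketch that parses an attribute line like the one quoted above and compares the raw count between two snapshots. It assumes the column order used in that line (ID, name, flags, value, worst, threshold, fail, raw value) and that the raw value is a plain integer. `SmartAttr`, `parse_attr`, and `has_grown` are hypothetical names for illustration, not part of any SMART tool.

```python
from typing import NamedTuple


class SmartAttr(NamedTuple):
    attr_id: int
    name: str
    flags: str
    value: int   # normalized value
    worst: int   # worst normalized value seen
    thresh: int  # failure threshold for the normalized value
    fail: str    # "-" when the attribute has not failed
    raw: int     # raw count, e.g. number of reallocated sectors


def parse_attr(line: str) -> SmartAttr:
    """Parse one whitespace-separated SMART attribute line."""
    parts = line.split()
    if len(parts) < 8:
        raise ValueError(f"unexpected attribute line: {line!r}")
    return SmartAttr(
        int(parts[0]), parts[1], parts[2],
        int(parts[3]), int(parts[4]), int(parts[5]),
        parts[6], int(parts[7]),
    )


def has_grown(previous: SmartAttr, current: SmartAttr) -> bool:
    """True if the raw count increased between two snapshots."""
    return current.raw > previous.raw
```

Parsing the quoted line gives a raw count of 2352 against a normalized value of 99 and a threshold of 10, which is why the attribute has not tripped a failure even though the raw count is high. Comparing snapshots taken a few days apart with `has_grown` shows whether the count is still climbing.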
July 6, 2020 Community Expert

2 hours ago, HAMANY said: Also it passes the SMART extended self-test.

It passed the short test; you should run a long one, though if SeaTools passed, the long test should pass too. It did fail a long test before, so there were issues before.
July 6, 2020 Author

4 hours ago, itimpi said: This does not look too good: While reallocated sectors are not necessarily a problem as long as the number stays constant, anything other than a small number is often a good indication that the drive's health may be suspect. You might want to run an extended SMART test on the drive to see how that goes.

3 hours ago, johnnie.black said: It passed the short test; you should run a long one, though if SeaTools passed, the long test should pass too. It did fail a long test before, so there were issues before.

Thank you both for your response. I will re-run the extended SMART test and SeaTools just to make sure.

The drive is still under warranty, but I can't create an RMA without a SeaTools log showing the problem.
July 6, 2020 Community Expert

1 hour ago, HAMANY said: The drive is still under warranty, but I can't create an RMA without a SeaTools log showing the problem.

With that many reallocated sectors I would expect them to accept an RMA regardless of the result of the extended SMART test.
July 10, 2020 Author

The SMART extended self-test has completed without errors! I'm sure SeaTools would give me the same results as before. Very weird.

Logs attached. Appreciate your suggestions. @johnnie.black @itimpi

tower-smart-20200711-0246.zip
tower-diagnostics-20200711-0250.zip
July 11, 2020

For the ZCT1BN14 disk, something weird happened: SMART cleared the previous errors. I haven't looked deeply into the previous diagnostics, but you have had trouble with 3 Seagate disks. Many years ago, 3 of my 4 Seagate 3TB disks (same lot) developed problems in less than 2 years. After that I never bought Seagate again, mainly because other brands offer good quality at a low price.

Error 12 [11] log entry is empty
Error 11 [10] log entry is empty
Error 10 [9] log entry is empty
Error 9 [8] log entry is empty
Error 8 [7] log entry is empty
Error 7 [6] log entry is empty
Error 6 [5] log entry is empty
Error 5 [4] log entry is empty

Edited July 11, 2020 by Benson
July 11, 2020 Community Expert

6 hours ago, HAMANY said: SMART extended self-test has completed without errors!

That means the disk is fine for now, just keep an eye on it.