July 2, 2021 Hi, my array has a new parity disk (8TB) and two 6TB shucked disks. Yesterday a parity check led to read errors (2048 errors on disk1) and stopped. Unraid offered to do a read-check, which I did. After 2-3 hours, disk2 is now at 400M+ read errors... Have I lost all data? nas-diagnostics-20210702-1401.zip
July 2, 2021 Both disks dropped offline; this is usually a power/connection problem. Check/replace the cables and post new diags so we can see the SMART reports.
July 2, 2021 Author I did a reboot, but now the server is not responding or coming back up. I also cannot ping the Unraid IP (host is down). Any hints?
July 2, 2021 That's a separate issue. Do you have an attached monitor/IPMI? Is it booting normally?
July 2, 2021 Author It's running headless, so I need to drag it out of its location to inspect. Back soon.
July 2, 2021 Author The array is stopped (I did not have auto start on), and disk1 is unmounted (device is missing, disabled). Attached are new diags. nas-diagnostics-20210702-1635.zip
July 2, 2021 Disk1 is still generating ATA errors and the SMART report is incomplete. Did you replace the cables?
July 2, 2021 Author Apologies, I had not replaced the cable yet. I've now replaced the SATA data cable for disk1 and attached new logs. From my casual inspection of the logs, the new cable doesn't seem to have made a difference? nas-diagnostics-20210702-1704.zip
July 2, 2021 Looks like a bad disk, but you should also have replaced or swapped the power cable, just to make sure. That's why I said cables, not cable.
July 2, 2021 Author Agreed. I've now swapped two power cables, but the end result is the same. My conclusion is that disk1 is bad. Now for the solution: I've purchased an additional 8TB drive, which arrives tomorrow. This will replace the 6TB disk1. What's the appropriate procedure to make this work?
July 2, 2021 Start the array; as long as the emulated disk is mounting, all you need to do is a standard disk replacement.
July 2, 2021 The standard process for replacing a failed drive is covered here in the online documentation that can be accessed via the Manual link at the bottom of the Unraid GUI.
July 2, 2021 Author Thanks both! Will report back tomorrow after replacing the disk and performing the steps outlined.
July 4, 2021 Author Hi, parity sync finished after nearly 16 hours. I've attached the latest diags. What worries me is that the new drive (Disk 1 ST8000VN004-2M2101_WSD0MT7X - 8 TB (sdc)) is already showing errors and that the 6TB had 588 errors during parity sync. Question 1: should I return the new disk and ask for a replacement (hopefully without errors)? Question 2: do these 588 parity sync errors have any repercussions on the state of the array? Thanks. nas-diagnostics-20210704-0620.zip
July 4, 2021 2 hours ago, Sander de Ruiter said: "What worries me is that the new drive (Disk 1 ST8000VN004-2M2101_WSD0MT7X - 8 TB (sdc)) is already showing errors" That disk is already showing a lot of pending sectors; it should be replaced. 2 hours ago, Sander de Ruiter said: "Question 2: do these 588 parity sync errors have any repercussions on the state of the array?" The errors on disk2 look more like a power/connection problem. Replace the cables and rebuild disk1 again to a new disk.
July 4, 2021 Author Alright, 8TB replacement ordered and the current one RMA'd. Will replace power/connection on disk2 and swap disk1 when the new 8TB arrives. Will report back.
July 8, 2021 Author Well, not sure if I'm just unlucky or if it's something else. Two days ago the replacement 8TB arrived. I swapped the new one for the faulty one, made sure all connections on the board and drives were sound, and started rebuilding the array. It was done after 16 hours, with no errors reported. The array ran without reporting errors for 1 day, and just now I woke up to: "Unraid Disk 2 error: 08-07-2021 06:11 Alert [NAS] - Disk 2 in error state (disk dsbl) ST6000DM003-2CY186_ZF2032KA (sde)" I've attached the logs again, but I'm really at a loss here. nas-diagnostics-20210708-0756.zip
July 8, 2021 This time it was a controller problem, and it affected all disks:
Jul 8 02:25:02 NAS kernel: ahci 0000:01:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0007 address=0xccebc000 flags=0x0000]
Jul 8 02:25:02 NAS kernel: ahci 0000:01:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0007 address=0xccebc480 flags=0x0000]
This is quite common with some Ryzen boards. Look for a BIOS update or use an add-on controller.
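For anyone wanting to check their own syslog for the same symptom, here is a minimal Python sketch that counts AMD-Vi IO_PAGE_FAULT events per device. The regular expression is modelled only on the two log lines quoted above, so treat the exact format as an assumption; other kernel versions may word these messages differently. The function name `count_page_faults` and the `SAMPLE` list are made up for illustration.

```python
import re
from collections import Counter

# Assumption: the pattern mirrors the two syslog lines quoted in this thread.
# Lines in a different format will simply not match.
PAGE_FAULT = re.compile(
    r"kernel: (?P<driver>\S+) "
    r"(?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"AMD-Vi: Event logged \[IO_PAGE_FAULT "
)


def count_page_faults(lines):
    """Count IO_PAGE_FAULT events, keyed by (driver, PCI address)."""
    counts = Counter()
    for line in lines:
        match = PAGE_FAULT.search(line)
        if match:
            counts[(match["driver"], match["pci"])] += 1
    return counts


# The two lines from the post above, used as sample input.
SAMPLE = [
    "Jul 8 02:25:02 NAS kernel: ahci 0000:01:00.1: AMD-Vi: Event logged "
    "[IO_PAGE_FAULT domain=0x0007 address=0xccebc000 flags=0x0000]",
    "Jul 8 02:25:02 NAS kernel: ahci 0000:01:00.1: AMD-Vi: Event logged "
    "[IO_PAGE_FAULT domain=0x0007 address=0xccebc480 flags=0x0000]",
]

if __name__ == "__main__":
    for (driver, pci), n in count_page_faults(SAMPLE).items():
        print(f"{driver} {pci}: {n} IO_PAGE_FAULT event(s)")
```

To run it against a real log, pass an open file object instead of `SAMPLE`, since the function accepts any iterable of lines. Faults logged against the `ahci` driver point at the SATA controller rather than at an individual disk, which matches the diagnosis above.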
July 8, 2021 Author Ok, thanks. Two questions: 1. disk2 is disabled and its contents are emulated. How can I restore disk2 to a normal state (if that is the right thing to do, given it was a controller problem)? 2. Is there a page with links to add-on controllers? I have no clue what to search for when purchasing.
July 8, 2021 6 minutes ago, Sander de Ruiter said: "Is there a page with links to add-on controllers? I have no clue what to search for for purchase." Yes, there:
July 8, 2021 38 minutes ago, Sander de Ruiter said: "disk2 is disabled and contents are emulated. How can I restore disk2 to a normal state (if that is the right thing to do, given it was a controller problem)?" The process is covered here in the online documentation that can be accessed via the Manual link at the bottom of the Unraid GUI.