Sander de Ruiter Posted July 2, 2021
Hi, I have an array with a new parity disk (8TB) and two shucked 6TB disks. Yesterday a parity check hit read errors (2048 errors on disk1) and stopped. Unraid offered to do a read check, which I did. After 2-3 hours, disk2 is now at 400M+ read errors. Have I lost all data?
nas-diagnostics-20210702-1401.zip
JorgeB Posted July 2, 2021
Both disks dropped offline. This is usually a power/connection problem: check/replace the cables and post new diags so we can see the SMART reports.
Sander de Ruiter (Author) Posted July 2, 2021
It's still doing a read check. Can I do a shutdown now?
Sander de Ruiter (Author) Posted July 2, 2021
I did a reboot, but now the server is not responding or coming back up. I also cannot ping the Unraid IP (host is down). Any hints?
JorgeB Posted July 2, 2021
That's a separate issue. Do you have an attached monitor/IPMI? Is it booting normally?
Sander de Ruiter (Author) Posted July 2, 2021
It's running headless, so I need to drag it out of its location to inspect. Back soon.
Sander de Ruiter (Author) Posted July 2, 2021
Array is stopped (I did not have auto start on), disk1 is unmounted (device is missing, disabled). New diags attached.
nas-diagnostics-20210702-1635.zip
JorgeB Posted July 2, 2021
Disk1 is still generating ATA errors and the SMART report is incomplete. Did you replace the cables?
Sander de Ruiter (Author) Posted July 2, 2021
Apologies, I had not replaced the cable yet. I've now replaced the SATA data cable for disk1 and attached new logs. My casual inspection of the logs leads me to believe the new cable didn't make a difference?
nas-diagnostics-20210702-1704.zip
JorgeB Posted July 2, 2021
Looks like a bad disk, but you should also have replaced or swapped the power cable, just to make sure. That's why I said cables, not cable.
Sander de Ruiter (Author) Posted July 2, 2021
Agreed. I've now swapped two power cables, but the end result is the same. My conclusion is that disk1 is bad. Now for the solution: I've purchased an additional 8TB drive, which arrives tomorrow. This will replace the 6TB disk1. What's the appropriate procedure to make this work?
JorgeB Posted July 2, 2021
Start the array. As long as the emulated disk is mounting, all you need to do is a standard disk replacement.
itimpi Posted July 2, 2021
The standard process for replacing a failed drive is covered here in the online documentation that can be accessed via the Manual link at the bottom of the Unraid GUI.
Sander de Ruiter (Author) Posted July 2, 2021
Thanks both! Will report back tomorrow after replacing the disk and performing the steps outlined.
Sander de Ruiter (Author) Posted July 4, 2021
Hi, parity sync finished after nearly 16 hours. I've attached the latest diags. What worries me is that the new drive (Disk 1 ST8000VN004-2M2101_WSD0MT7X - 8 TB (sdc)) is already showing errors and that the 6TB had 588 errors during parity sync.
Question 1: should I return the new disk and ask for a replacement (hopefully without errors)?
Question 2: do these parity sync errors on 588 have any repercussions on the state of the array?
Thanks
nas-diagnostics-20210704-0620.zip
JorgeB Posted July 4, 2021
2 hours ago, Sander de Ruiter said: "What worries me is that the new drive (Disk 1 ST8000VN004-2M2101_WSD0MT7X - 8 TB (sdc)) is already showing errors"
That disk is already showing a lot of pending sectors; it should be replaced.
2 hours ago, Sander de Ruiter said: "Question 2: do these parity sync errors on 588 have any repercussions on the state of the array?"
Errors on disk2 look more like a power/connection problem. Replace the cables and rebuild disk1 again to a new disk.
Sander de Ruiter (Author) Posted July 4, 2021
Alright, 8TB replacement ordered and the current one RMA'd. Will replace the power/data cables on disk2 and swap disk1 when the new 8TB arrives. Will report back.
Sander de Ruiter (Author) Posted July 8, 2021
Well, not sure if I'm just unlucky or if something else is going on. Two days ago the replacement 8TB arrived. I swapped the new one in for the faulty one, made sure all connections on the board and drives were sound, and started rebuilding the array. It was done after 16 hours with no errors reported. The array then ran for 1 day without reporting errors, and just now I woke up to this notification:
Unraid Disk 2 error: 08-07-2021 06:11
Alert [NAS] - Disk 2 in error state (disk dsbl)
ST6000DM003-2CY186_ZF2032KA (sde)
I've attached the logs again, but I'm really at a loss here.
nas-diagnostics-20210708-0756.zip
JorgeB Posted July 8, 2021
This time it was a controller problem, and it affected all disks:
Jul 8 02:25:02 NAS kernel: ahci 0000:01:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0007 address=0xccebc000 flags=0x0000]
Jul 8 02:25:02 NAS kernel: ahci 0000:01:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0007 address=0xccebc480 flags=0x0000]
This is quite common with some Ryzen boards. Look for a BIOS update or use an add-on controller.
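For anyone wanting to check their own syslog for the same symptom, here is a minimal Python sketch that counts IO_PAGE_FAULT events per device. It is not an Unraid tool: the regular expression is inferred only from the two log lines quoted above, and the names `FAULT_RE`, `count_page_faults` and `SAMPLE` are made up for this example.

```python
import re
from collections import Counter

# Pattern inferred from the two kernel lines quoted in this thread (an assumption;
# other kernels or drivers may format the message differently).
FAULT_RE = re.compile(
    r"kernel: (?P<driver>\S+) "
    r"(?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]): "
    r"AMD-Vi: Event logged \[IO_PAGE_FAULT .*?address=(?P<address>0x[0-9a-f]+)"
)


def count_page_faults(lines):
    """Count IO_PAGE_FAULT events per (driver, PCI address) pair."""
    counts = Counter()
    for line in lines:
        match = FAULT_RE.search(line)
        if match:
            counts[(match.group("driver"), match.group("pci"))] += 1
    return counts


# The two lines from the post above, used as sample input.
SAMPLE = [
    "Jul 8 02:25:02 NAS kernel: ahci 0000:01:00.1: AMD-Vi: Event logged "
    "[IO_PAGE_FAULT domain=0x0007 address=0xccebc000 flags=0x0000]",
    "Jul 8 02:25:02 NAS kernel: ahci 0000:01:00.1: AMD-Vi: Event logged "
    "[IO_PAGE_FAULT domain=0x0007 address=0xccebc480 flags=0x0000]",
]

if __name__ == "__main__":
    for (driver, pci), n in count_page_faults(SAMPLE).items():
        print(f"{driver} at {pci}: {n} IO_PAGE_FAULT event(s)")
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example the syslog text file from a diagnostics zip (the exact path inside the zip is not given in this thread). Faults that all point at the same `ahci` device would be consistent with a controller-level problem rather than a single bad disk.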
Sander de Ruiter (Author) Posted July 8, 2021
Ok thanks. Two questions:
1. disk2 is disabled and contents are emulated. How can I restore disk2 to a normal state (if that is the right thing to do, given it was a controller problem)?
2. Is there a page with links to add-on controllers? I have no clue what to search for when purchasing.
ChatNoir Posted July 8, 2021
6 minutes ago, Sander de Ruiter said: "Is there a page with links to add-on controllers? I have no clue what to search for for purchase."
Yes, there:
itimpi Posted July 8, 2021
38 minutes ago, Sander de Ruiter said: "disk2 is disabled and contents are emulated. How can I restore disk2 to a normal state (if that is the right thing to do, given it was a controller problem)?"
The process is covered here in the online documentation that can be accessed via the Manual link at the bottom of the Unraid GUI.