lostwebb Posted September 17, 2021 (edited)
Hi there, thanks in advance for the help. I had a disk in an errored state for a few weeks. Its contents were emulated and I had been using the array. I had to reboot the server after changing some network settings, and when it booted back up it had reset the drive assignments. Now any disk assignments I make show up as a new config, and I only have the "parity is valid" option. I don't remember which disk was disabled, and the logs folder is empty. Is there anything I can do at this point to get the array back to its contents-emulated state? I know it's a bit of a mess, but I appreciate the help. Thank you
Edited September 19, 2021 by lostwebb: Added solved tag.
JorgeB Posted September 18, 2021
There's a way. Do you know all the disk assignments, and is it single or dual parity?
lostwebb Posted September 18, 2021
Yes, I know the disk assignments, but I don't recall which disk was disabled. I've looked through all the logs I could find and can't find any sign of which one it was. It is single parity. Thanks Jorge!
JorgeB Posted September 18, 2021
lostwebb said: "but I don't recall the disk that was disabled."
That's pretty important. Post current diagnostics; if a disk is clearly failing, it should be visible in SMART.
lostwebb Posted September 18, 2021
I have just realised it is recorded in the DISK_ASSIGNMENTS.txt file that I get User Scripts to create once a week. I believe it was disk 12 that was disabled. Not sure if you can corroborate that with the diags? I believe it to be accurate, as I remember a higher-numbered disk being disabled and the User Scripts disk list was created 5 days ago. So it looks like I need to get Unraid to believe that disk 12 is disabled again and emulate its contents from parity. I will buy a replacement disk and a second parity disk right away to stop this from happening again. Thank you for your help.
tower-diagnostics-20210918-2212.zip
JorgeB Posted September 19, 2021
According to that, yes, it was disk12. The disk itself looks OK, just a lot of CRC errors, which suggests a cable problem. If you didn't write anything to the emulated disk after disk12 got disabled, you could just re-sync parity (after replacing the SATA cables on disk12), but if you want to go back to how it was, you can do this:
-Since you've already re-assigned all the disks, just make sure all assignments are correct.
-IMPORTANT - Check both "parity is already valid" and "maintenance mode" and start the array (note that the GUI will still show that data on the parity disk(s) will be overwritten; this is normal, as it doesn't account for the checkbox, but it won't be overwritten as long as the box is checked)
-Stop array
-Unassign disk12
-Start array (in normal mode now); ideally the emulated disk will now mount and its contents will look correct. If it doesn't, you should run a filesystem check on the emulated disk
-If the emulated disk mounts and contents look correct, you can then rebuild to a new disk or the old one.
lostwebb Posted September 19, 2021
Thank you very much, it has worked perfectly! I did exactly what you said and it came back first time. Thank you!!!
lostwebb Posted December 30, 2021 (edited)
Hi again. I have just had the same thing happen again: I had to wait for a disk to arrive and wrote contents to the emulated disks in the meantime. Since the original issue I have added a second parity disk. I rebooted, lost the disk assignments, and assigned all disks back, including the new disk to the missing slot. I started in "maintenance mode" with "parity is valid", stopped the array, unassigned the disk and started in normal mode. I am now getting an unmountable error, and I am unable to start a disk check in maintenance mode. Does this same fix work with dual parity, or is there some other issue? Thanks again for the help.
tower-diagnostics-20211230-2349.zip
Edited December 30, 2021 by lostwebb: Added diags
JorgeB Posted December 31, 2021
Diags after array start in normal mode, please.
lostwebb Posted December 31, 2021
I triple-checked that all the assignments are correct. It's throwing disk errors on 3 disks now too. Thank you
tower-diagnostics-20211231-1142.zip
JorgeB Posted December 31, 2021
There are read errors on multiple disks:
Dec 31 11:41:40 Tower kernel: md: disk1 read error, sector=5860527944
..
Dec 31 11:41:40 Tower kernel: md: disk6 read error, sector=5860527944
..
Dec 31 11:41:40 Tower kernel: md: disk2 read error, sector=3911257848
You need to fix that first; it looks like a connection/power problem.
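The syslog check in the post above can be roughly sketched in Python. This is only a minimal sketch: the regular expression is inferred from the three log lines quoted in this thread rather than from any documented Unraid log format, and the helper name `count_md_read_errors` is hypothetical.

```python
import re
from collections import Counter

# Pattern inferred from the log lines quoted in this thread (an assumption,
# not a documented Unraid log format).
READ_ERROR = re.compile(r"md: (disk\d+) read error, sector=(\d+)")


def count_md_read_errors(lines):
    """Count md read-error lines per disk in an iterable of syslog lines."""
    counts = Counter()
    for line in lines:
        match = READ_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# The three lines quoted in the post above.
sample = [
    "Dec 31 11:41:40 Tower kernel: md: disk1 read error, sector=5860527944",
    "Dec 31 11:41:40 Tower kernel: md: disk6 read error, sector=5860527944",
    "Dec 31 11:41:40 Tower kernel: md: disk2 read error, sector=3911257848",
]

errors = count_md_read_errors(sample)
for disk, n in sorted(errors.items()):
    print(f"{disk}: {n} read error(s)")
```

Errors spread across several disks at the same timestamp, as here, are consistent with the shared connection/power cause suggested in the post, though the count alone does not prove it.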
lostwebb Posted December 31, 2021
Thanks
tower-diagnostics-20211231-1739.zip
JorgeB Posted January 1, 2022 (Solution)
Dec 31 17:38:45 Tower kernel: attempt to access beyond end of device
Dec 31 17:38:45 Tower kernel: md12: rw=4096, want=7814037064, limit=7814034952
Dec 31 17:38:45 Tower kernel: XFS (md12): last sector read failed
This suggests you used a slightly smaller 4TB disk in this position, maybe this one:
Model Family: Western Digital Red
Device Model: WDC WD40EFRX-68WT0N0
Serial Number: WD-WCC4E1826444
User Capacity: 4,000,785,948,160 bytes [4.00 TB]
Notice that it's slightly smaller than the other ones:
User Capacity: 4,000,787,030,016 bytes [4.00 TB]
Maybe a shucked disk?
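The arithmetic behind this diagnosis can be checked directly from the numbers quoted in the post. The sketch below assumes that `want` and `limit` in the kernel message are counted in 512-byte sectors; that unit is an assumption here, not something the log states.

```python
SECTOR_BYTES = 512  # assumption: want/limit are in 512-byte sectors

# Values copied from the kernel log and SMART output quoted above.
want, limit = 7814037064, 7814034952
small_disk_bytes = 4_000_785_948_160   # the suspect WD40EFRX
other_disk_bytes = 4_000_787_030_016   # the other 4TB disks

# How far past the end of the device the filesystem tried to read.
shortfall_sectors = want - limit
shortfall_bytes = shortfall_sectors * SECTOR_BYTES

# How much smaller the suspect disk is than the others.
capacity_gap_bytes = other_disk_bytes - small_disk_bytes
capacity_gap_sectors = capacity_gap_bytes // SECTOR_BYTES

print(f"read overshoot: {shortfall_sectors} sectors ({shortfall_bytes} bytes)")
print(f"capacity gap:   {capacity_gap_sectors} sectors ({capacity_gap_bytes} bytes)")
```

Under that assumption the overshoot is 2112 sectors and the capacity gap is 2113 sectors, so the two figures agree to within one sector, which fits the explanation that the filesystem was sized for the larger disk.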
lostwebb Posted January 1, 2022 (edited)
Ahh OK, yes, I just bought it off eBay, so it's probably shucked. I have a new one coming next week, and I'll replace it with that. How can I get the array back up and running in the meantime? Do I need to add the failing disk back just to assign it and then unassign it? Thanks and happy new year
I did the process with the failed disk and it has come back online fine. I have a new disk coming, so I will replace the failed one with that and use the shucked one as a new drive in my backup Unraid. Thank you so much for the help, as usual.
Edited January 1, 2022 by lostwebb: Resolved