Kismet Posted December 29, 2017
I just completed a successful parity copy from my old disk to the new one. Then I shut down the server and replaced the former parity disk with a new drive to do a data restore. However, when I did that, the array was looking for the old parity drive and considered the new parity drive invalid (it still recognizes the drive as a valid XFS disk with data, just not as the parity disk). The only explanation I can see is that I have to put the old parity drive in, start the array (assuming I don't have to do another copy), let the system do a data rebuild on the former parity disk, then remove that disk and put in the new drive to do a new data rebuild. Is that really the case, or is there some other reason my new parity disk wouldn't be recognized?

Edit: So I went in and set everything back to how it was when I completed the parity copy, and Unraid still won't recognize my new disk as a parity disk and wants me to do another 9-hour copy all over again.

Second edit: This is an edit born of frustration, because it really looks like the system completed the copy and the array, as listed in the UI, was showing the properly configured array (post copy), but Unraid didn't actually save the new configuration while leaving me with every indication that it had done so. And now, because the configuration on the USB stick wasn't updated, I've got to redo the whole process just so I can force the UI to save the new configuration by hitting the start array button, despite no indication that this would be necessary. Hypothetically I could just rebuild the array, but if I do that I lose the parity information, and because I'm replacing a data disk that went bad, I'd lose the ability to restore it.
JorgeB Posted December 29, 2017
If you haven't rebooted yet, post your diagnostics: Tools -> Diagnostics
Kismet Posted December 29, 2017
1 minute ago, johnnie.black said: If you haven't rebooted yet, post your diagnostics: Tools -> Diagnostics

I have rebooted, although I can still give you the diagnostics if you'd like; just tell me which configuration of drives you'd like me to show. I don't think they'll tell you anything, though. I can read and mount all the drives just fine, and they show up in the Unraid GUI without any problem. It's just that after the reboot the array wants to see the old drive as the parity disk and not the new one.
JorgeB Posted December 29, 2017
The diagnostics that include the syslog showing the parity swap.
Kismet Posted December 29, 2017
22 minutes ago, johnnie.black said: The diagnostics that include the syslog showing the parity swap.

Unless there is an archive somewhere that I can't find, I don't have those. The diagnostics tool only gives me the syslog records back to the last startup.

Edit: Since the day is over for me, I'm just going to kick off a copy again and see what happens in the morning.

polaris-diagnostics-20171228-2353.zip
JorgeB Posted December 29, 2017
If there are issues again, grab the diagnostics before rebooting.
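Since the diagnostics only reach back to the last startup, one way to follow the "grab it before rebooting" advice is to copy the live syslog somewhere persistent first. The following is a minimal Python sketch, not an Unraid feature: the function name `snapshot_syslog` is hypothetical, and the paths in the usage note (`/var/log/syslog` for the live log, `/boot/logs` on the flash drive) are assumptions to check against your own system.

```python
import shutil
import time
from pathlib import Path


def snapshot_syslog(src: str, dest_dir: str) -> Path:
    """Copy the log file at `src` into `dest_dir` under a timestamped name.

    Only the file contents are copied (no permissions or timestamps),
    which keeps the copy simple on FAT-formatted flash drives.
    """
    source = Path(src)
    target_dir = Path(dest_dir)
    # Create the destination folder if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = target_dir / f"syslog-{stamp}.txt"
    shutil.copyfile(source, target)
    return target
```

Usage, with the assumed paths: `snapshot_syslog("/var/log/syslog", "/boot/logs")` run from a console before shutting down. Downloading the full diagnostics zip from Tools -> Diagnostics, as suggested above, remains the more complete option.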
JorgeB Posted December 29, 2017
The 3TB WD, which I assume is your old parity, has double-digit raw read errors. Those are never a good sign on WD drives, especially when above single digits.
Kismet Posted December 29, 2017
9 hours ago, johnnie.black said: The 3TB WD, which I assume is your old parity, has double-digit raw read errors. Those are never a good sign on WD drives, especially when above single digits.

That's why I rebooted to install a new drive instead of doing a data rebuild on the old parity drive.

So the second copy finished a few minutes ago. I pulled a diagnostic package and then, because I'm a glutton for punishment and wanted to check whether I was right, I started the array in maintenance mode, then shut down, inserted the new disk, and everything is working as expected. The new parity disk was recognized as my array's parity disk and the system is currently doing a data rebuild on the new data disk.

Which is itself extremely frustrating: a little warning or notice that the UI is lying to you and that you have to start the array to finalize the configuration would have been nice. Even when you start the copy, the text describing what you're doing just says you "may" start the array afterwards, like it doesn't matter whether you do or not. I know all it cost me was a little bit of wear and an extra 9 hours, but the level of frustration, and the amount of expletives I want to put in this post over the lack of such a simple and basic UI tweak, is seriously making me consider whether I want to continue using Unraid.

polaris-diagnostics-20171229-0952.zip
JorgeB Posted December 29, 2017
Yeah, that "may" is probably not the best wording, but after the copy finishes the only option you get is to start the array to begin the rebuild. This is the first time I recall someone having problems with the procedure description.
Kismet Posted December 30, 2017
7 hours ago, johnnie.black said: Yeah, that "may" is probably not the best wording, but after the copy finishes the only option you get is to start the array to begin the rebuild. This is the first time I recall someone having problems with the procedure description.

So now I have a whole new problem. The data rebuild on my new drive just completed, so I stopped the array in order to power down the system and replace the next drive, since I'm replacing them all. Except when I powered down the array, the UI now shows that the array wants to see the old data drive instead of the new one which just got rebuilt, and it appears as if it won't let me start the array without kicking off another rebuild.

polaris-diagnostics-20171229-1832.zip

Edit: Glorious, just glorious. I figured I should be able to just restart the array and do the data rebuild. After all, the data rebuild completed successfully according to the UI and syslog, so it should be able to quickly validate that the data is there and not need to copy it all over again. No, no check at all, it's redoing the whole write all over again. At least when I was doing the parity swap you could say I executed the procedure from the manual incorrectly, because I didn't complete the data rebuild on the old parity drive, which is part of the process. Here I followed everything step by step with no errors anywhere, and I'm still having to repeat the process and hope it just works the second time. See you in another 8 hours.
JorgeB Posted December 30, 2017
That's not normal at all. You may have a problem with your flash drive, but we'd need to see the syslog from before rebooting to see what happened.
Kismet Posted December 30, 2017
7 hours ago, johnnie.black said: That's not normal at all. You may have a problem with your flash drive, but we'd need to see the syslog from before rebooting to see what happened.

The diagnostics I included with the screenshot were taken after the write finished and I stopped the array, no rebooting. After that I started the array and let it run, and here are the diagnostics from after I got back to the server and the second data rebuild was complete. After this report (polaris-diagnostics-20171230-0719) I stopped the array, and now the UI is telling me everything is normal, no unexpected disk. If it's the USB then great, finally something I can pinpoint, but this time I did absolutely nothing different and haven't rebooted since before the first rebuild, and somehow it's just working after the second rebuild.

polaris-diagnostics-20171230-0719.zip
JorgeB Posted December 30, 2017
46 minutes ago, Kismet said: The diagnostics I included with the screenshot were taken after the write finished and I stopped the array, no rebooting

Yes, sorry about that, I must have seen those before my coffee. Something weird is going on here, and I'm not really sure why, as I've never seen this before. Right after boot, disk1 is seen as not assigned instead of empty:

Dec 29 09:58:38 Polaris kernel: mdcmd (1): import 0 sdc 3907018532 0 WDC_WD40EFRX-68N32N0_WD-WCC7K4YJ60YJ
Dec 29 09:58:38 Polaris kernel: md: import disk0: (sdc) WDC_WD40EFRX-68N32N0_WD-WCC7K4YJ60YJ size: 3907018532
Dec 29 09:58:38 Polaris kernel: mdcmd (2): import 1
Dec 29 09:58:38 Polaris kernel: mdcmd (3): import 2 sdf 1953514552 0 WDC_WD20EARX-00PASB0_WD-WMAZA7604175
Dec 29 09:58:38 Polaris kernel: md: import disk2: (sdf) WDC_WD20EARX-00PASB0_WD-WMAZA7604175 size: 1953514552
Dec 29 09:58:38 Polaris kernel: mdcmd (4): import 3 sdd 1953514552 0 WDC_WD20EARX-00PASB0_WD-WCAZAD791516
Dec 29 09:58:38 Polaris kernel: md: import disk3: (sdd) WDC_WD20EARX-00PASB0_WD-WCAZAD791516 size: 1953514552

It should look something like this:

Dec 29 09:58:38 Polaris kernel: mdcmd (1): import 0 sdc 3907018532 0 WDC_WD40EFRX-68N32N0_WD-WCC7K4YJ60YJ
Dec 29 09:58:38 Polaris kernel: md: import disk0: (sdc) WDC_WD40EFRX-68N32N0_WD-WCC7K4YJ60YJ size: 3907018532
Dec 29 09:58:38 Polaris kernel: mdcmd (2): import 1
Dec 29 09:58:38 Polaris kernel: md: import_slot: 1 empty
Dec 29 09:58:38 Polaris kernel: mdcmd (3): import 2 sdf 1953514552 0 WDC_WD20EARX-00PASB0_WD-WMAZA7604175
Dec 29 09:58:38 Polaris kernel: md: import disk2: (sdf) WDC_WD20EARX-00PASB0_WD-WMAZA7604175 size: 1953514552
Dec 29 09:58:38 Polaris kernel: mdcmd (4): import 3 sdd 1953514552 0 WDC_WD20EARX-00PASB0_WD-WCAZAD791516
Dec 29 09:58:38 Polaris kernel: md: import disk3: (sdd) WDC_WD20EARX-00PASB0_WD-WCAZAD791516 size: 1953514552

When you assign the new disk1 it's also not normal:

Dec 29 09:59:14 Polaris kernel: mdcmd (2): import 1 sde 3907018532 0 WDC_WD40EFRX-68N32N0_WD-WCC7K2YF4RV9
Dec 29 09:59:14 Polaris kernel: md: import disk1: (sde) WDC_WD40EFRX-68N32N0_WD-WCC7K2YF4RV9 size: 3907018532
Dec 29 09:59:14 Polaris kernel: mdcmd (3): import 2 sdf 1953514552 0 WDC_WD20EARX-00PASB0_WD-WMAZA7604175
Dec 29 09:59:14 Polaris kernel: md: import disk2: (sdf) WDC_WD20EARX-00PASB0_WD-WMAZA7604175 size: 1953514552

It should be like this:

Dec 29 09:59:14 Polaris kernel: mdcmd (2): import 1 sde 3907018532 0 WDC_WD40EFRX-68N32N0_WD-WCC7K2YF4RV9
Dec 29 09:59:14 Polaris kernel: md: import disk1: (sde) WDC_WD40EFRX-68N32N0_WD-WCC7K2YF4RV9 size: 3907018532
Dec 29 09:59:14 Polaris kernel: md: import_slot: 1 replaced
Dec 29 09:59:14 Polaris kernel: mdcmd (3): import 2 sdf 1953514552 0 WDC_WD20EARX-00PASB0_WD-WMAZA7604175
Dec 29 09:59:14 Polaris kernel: md: import disk2: (sdf) WDC_WD20EARX-00PASB0_WD-WMAZA7604175 size: 1953514552

And when you start the array there's a mention of a rebuild, but also a parity check:

Dec 29 10:00:41 Polaris kernel: mdcmd (40): check correct
Dec 29 10:00:41 Polaris kernel: md: recovery thread: recon D1 ..

Only the recon should appear, and when the rebuild finishes it's complaining that the disk is wrong only because of the size, i.e., like it didn't expand the filesystem from 3 to 4TB. Like I said, I've never seen this before, so I have no idea what's going on. Something wrong with the flash drive is a possibility, though I don't see any errors about that. If you have a Windows desktop/laptop, try running chkdsk on it, or whatever the equivalent is on a Mac.
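The comparison above (an `import 1` command with no `md:` line answering it) can be checked mechanically once the syslog is split one entry per line. The sketch below is an illustration based only on the excerpts in this post; the function names `summarize_imports` and `silent_slots` are hypothetical, and treating "no device and no reply" as the symptom is an assumption drawn from the expected output shown above, not from the Unraid driver source.

```python
import re

# "mdcmd (2): import 1" or "mdcmd (1): import 0 sdc 3907018532 0 <id>"
IMPORT_CMD = re.compile(r"mdcmd \((\d+)\): import (\d+)(?: (\S+))?")
# "md: import disk0: ..." or "md: import_slot: 1 empty"
MD_REPLY = re.compile(r"md: (import disk\d+|import_slot: \d+ \w+)")


def summarize_imports(lines):
    """Map each slot number to its device (or None) and the md: replies that follow."""
    slots = {}
    current = None
    for line in lines:
        cmd = IMPORT_CMD.search(line)
        if cmd:
            current = int(cmd.group(2))
            slots[current] = {"device": cmd.group(3), "replies": []}
            continue
        reply = MD_REPLY.search(line)
        if reply and current is not None:
            slots[current]["replies"].append(reply.group(1))
    return slots


def silent_slots(slots):
    """Slots imported without a device and without any md: reply line."""
    return sorted(
        slot for slot, info in slots.items()
        if info["device"] is None and not info["replies"]
    )


# Lines taken from the excerpts above (disk3 omitted for brevity).
SEEN = [
    "Dec 29 09:58:38 Polaris kernel: mdcmd (1): import 0 sdc 3907018532 0 WDC_WD40EFRX-68N32N0_WD-WCC7K4YJ60YJ",
    "Dec 29 09:58:38 Polaris kernel: md: import disk0: (sdc) WDC_WD40EFRX-68N32N0_WD-WCC7K4YJ60YJ size: 3907018532",
    "Dec 29 09:58:38 Polaris kernel: mdcmd (2): import 1",
    "Dec 29 09:58:38 Polaris kernel: mdcmd (3): import 2 sdf 1953514552 0 WDC_WD20EARX-00PASB0_WD-WMAZA7604175",
    "Dec 29 09:58:38 Polaris kernel: md: import disk2: (sdf) WDC_WD20EARX-00PASB0_WD-WMAZA7604175 size: 1953514552",
]
# Same lines with the reply that was expected after "import 1".
EXPECTED = SEEN[:3] + ["Dec 29 09:58:38 Polaris kernel: md: import_slot: 1 empty"] + SEEN[3:]

print(silent_slots(summarize_imports(SEEN)))      # slot 1 has no reply
print(silent_slots(summarize_imports(EXPECTED)))  # nothing flagged
```

This only flags the first anomaly (the unanswered empty slot). The missing `import_slot: 1 replaced` line after assigning the new disk would need a separate check, since in that case the slot does get an `import disk1` reply.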
P.S.: unrelated to your issues, but there are also some ATA errors, probably cable/connection related; ATA2 and ATA4 are both 4TB disks:

Dec 29 11:57:32 Polaris kernel: ata4.00: exception Emask 0x10 SAct 0x0 SErr 0x480100 action 0x6 frozen
Dec 29 11:57:32 Polaris kernel: ata4.00: irq_stat 0x08000000, interface fatal error
Dec 29 11:57:32 Polaris kernel: ata4: SError: { UnrecovData 10B8B Handshk }
Dec 29 11:57:32 Polaris kernel: ata4.00: failed command: WRITE DMA EXT
Dec 29 11:57:32 Polaris kernel: ata4.00: cmd 35/00:40:a0:35:da/00:05:57:00:00/e0 tag 8 dma 688128 out
Dec 29 11:57:32 Polaris kernel: res 50/00:00:a0:35:da/00:00:57:00:00/e0 Emask 0x10 (ATA bus error)
Dec 29 11:57:32 Polaris kernel: ata4.00: status: { DRDY }
Dec 29 11:57:32 Polaris kernel: ata4: hard resetting link
Dec 29 11:57:32 Polaris kernel: ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Dec 29 11:57:32 Polaris kernel: ata4.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
Dec 29 11:57:32 Polaris kernel: ata4.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
Dec 29 11:57:32 Polaris kernel: ata4.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
Dec 29 11:57:32 Polaris kernel: ata4.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
Dec 29 11:57:32 Polaris kernel: ata4.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
Dec 29 11:57:32 Polaris kernel: ata4.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
Dec 29 11:57:32 Polaris kernel: ata4.00: configured for UDMA/133
Dec 29 11:57:32 Polaris kernel: ata4: EH complete
Dec 29 11:57:34 Polaris kernel: ata4.00: exception Emask 0x10 SAct 0x0 SErr 0x480100 action 0x6 frozen
Dec 29 11:57:34 Polaris kernel: ata4.00: irq_stat 0x08000000, interface fatal error
Dec 29 11:57:34 Polaris kernel: ata4: SError: { UnrecovData 10B8B Handshk }
Dec 29 11:57:34 Polaris kernel: ata4.00: failed command: WRITE DMA EXT
Dec 29 11:57:34 Polaris kernel: ata4.00: cmd 35/00:40:60:2a:e0/00:05:57:00:00/e0 tag 27 dma 688128 out
Dec 29 11:57:34 Polaris kernel: res 50/00:00:60:2a:e0/00:00:57:00:00/e0 Emask 0x10 (ATA bus error)
Dec 29 11:57:34 Polaris kernel: ata4.00: status: { DRDY }
Dec 29 11:57:34 Polaris kernel: ata4: hard resetting link
Dec 29 11:57:35 Polaris kernel: ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Dec 29 11:57:35 Polaris kernel: ata4.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
Dec 29 11:57:35 Polaris kernel: ata4.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
Dec 29 11:57:35 Polaris kernel: ata4.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
Dec 29 11:57:35 Polaris kernel: ata4.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
Dec 29 11:57:35 Polaris kernel: ata4.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
Dec 29 11:57:35 Polaris kernel: ata4.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
Dec 29 11:57:35 Polaris kernel: ata4.00: configured for UDMA/133
Dec 29 11:57:35 Polaris kernel: ata4: EH complete
Dec 29 11:57:51 Polaris kernel: ata2.00: exception Emask 0x10 SAct 0x0 SErr 0x280100 action 0x6 frozen
Dec 29 11:57:51 Polaris kernel: ata2.00: irq_stat 0x08000000, interface fatal error
Dec 29 11:57:51 Polaris kernel: ata2: SError: { UnrecovData 10B8B BadCRC }
Dec 29 11:57:51 Polaris kernel: ata2.00: failed command: READ DMA EXT
Dec 29 11:57:51 Polaris kernel: ata2.00: cmd 25/00:40:a0:4c:11/00:05:58:00:00/e0 tag 6 dma 688128 in
Dec 29 11:57:51 Polaris kernel: res 50/00:00:a0:4c:11/00:00:58:00:00/e0 Emask 0x10 (ATA bus error)
Dec 29 11:57:51 Polaris kernel: ata2.00: status: { DRDY }
Dec 29 11:57:51 Polaris kernel: ata2: hard resetting link
Dec 29 11:57:51 Polaris kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Dec 29 11:57:51 Polaris kernel: ata2.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
Dec 29 11:57:51 Polaris kernel: ata2.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
Dec 29 11:57:51 Polaris kernel: ata2.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
Dec 29 11:57:51 Polaris kernel: ata2.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
Dec 29 11:57:51 Polaris kernel: ata2.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
Dec 29 11:57:51 Polaris kernel: ata2.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
Dec 29 11:57:51 Polaris kernel: ata2.00: configured for UDMA/133
Dec 29 11:57:51 Polaris kernel: ata2: EH complete
Dec 29 12:47:35 Polaris kernel: ata4.00: exception Emask 0x10 SAct 0x0 SErr 0x480100 action 0x6 frozen
Dec 29 12:47:35 Polaris kernel: ata4.00: irq_stat 0x08000000, interface fatal error
Dec 29 12:47:35 Polaris kernel: ata4: SError: { UnrecovData 10B8B Handshk }
Dec 29 12:47:35 Polaris kernel: ata4.00: failed command: WRITE DMA EXT
Dec 29 12:47:35 Polaris kernel: ata4.00: cmd 35/00:40:58:c2:3d/00:05:7a:00:00/e0 tag 15 dma 688128 out
Dec 29 12:47:35 Polaris kernel: res 50/00:00:58:c2:3d/00:00:7a:00:00/e0 Emask 0x10 (ATA bus error)
Dec 29 12:47:35 Polaris kernel: ata4.00: status: { DRDY }
Dec 29 12:47:35 Polaris kernel: ata4: hard resetting link
Dec 29 12:47:35 Polaris kernel: ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Dec 29 12:47:35 Polaris kernel: ata4.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
Dec 29 12:47:35 Polaris kernel: ata4.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
Dec 29 12:47:35 Polaris kernel: ata4.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
Dec 29 12:47:35 Polaris kernel: ata4.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
Dec 29 12:47:35 Polaris kernel: ata4.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
Dec 29 12:47:35 Polaris kernel: ata4.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
Dec 29 12:47:35 Polaris kernel: ata4.00: configured for UDMA/133
Dec 29 12:47:35 Polaris kernel: ata4: EH complete
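With a long syslog it can help to tally these ATA errors per port before deciding which cable or connection to reseat. The sketch below is a rough illustration: it assumes the log is split one entry per line, the function name `count_ata_errors` is hypothetical, and the sample lines are copied from the excerpt in this post.

```python
import re
from collections import Counter

# Matches "kernel: ata4.00: exception Emask ..." and captures the port ("ata4").
EXCEPTION = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: exception Emask")
# Matches "kernel: ata4.00: failed command: WRITE DMA EXT".
FAILED = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: failed command: (.+)$")


def count_ata_errors(lines):
    """Return (exceptions per port, failed commands per (port, command))."""
    exceptions = Counter()
    commands = Counter()
    for line in lines:
        match = EXCEPTION.search(line)
        if match:
            exceptions[match.group(1)] += 1
            continue
        match = FAILED.search(line)
        if match:
            commands[(match.group(1), match.group(2).strip())] += 1
    return exceptions, commands


# Exception and failed-command lines from the four error episodes in this post.
SAMPLE = [
    "Dec 29 11:57:32 Polaris kernel: ata4.00: exception Emask 0x10 SAct 0x0 SErr 0x480100 action 0x6 frozen",
    "Dec 29 11:57:32 Polaris kernel: ata4.00: failed command: WRITE DMA EXT",
    "Dec 29 11:57:34 Polaris kernel: ata4.00: exception Emask 0x10 SAct 0x0 SErr 0x480100 action 0x6 frozen",
    "Dec 29 11:57:34 Polaris kernel: ata4.00: failed command: WRITE DMA EXT",
    "Dec 29 11:57:51 Polaris kernel: ata2.00: exception Emask 0x10 SAct 0x0 SErr 0x280100 action 0x6 frozen",
    "Dec 29 11:57:51 Polaris kernel: ata2.00: failed command: READ DMA EXT",
    "Dec 29 12:47:35 Polaris kernel: ata4.00: exception Emask 0x10 SAct 0x0 SErr 0x480100 action 0x6 frozen",
    "Dec 29 12:47:35 Polaris kernel: ata4.00: failed command: WRITE DMA EXT",
]

exceptions, commands = count_ata_errors(SAMPLE)
for port, total in sorted(exceptions.items()):
    print(port, total)
```

On the sample, ata4 accounts for three of the four episodes (all writes) and ata2 for one (a read), which fits the suggestion to look at the cabling of those two ports first.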
This topic is now archived and is closed to further replies.