JorgeB Posted December 6, 2020
Since the disk looks healthy and the errors don't look disk related, instead of rebuilding you can do a new config and re-sync parity. But first make sure the actual disk mounts correctly; you can check that with UD (array must be stopped).
abhi.ko Posted December 6, 2020 (edited)
So ignore the lost and found and do the following, just for clarity:
1. Stop the array
2. Unassign disk 15
3. Start the array
4. Stop it again
5. Assign this disk to disk 15 again
6. Start the array
7. Resync parity
Are these the right steps, @JorgeB?
JorgeB Posted December 7, 2020
That won't re-sync parity; it will rebuild the disabled disk on top, which is probably not what you want in this case.
20 hours ago, abhi.ko said:
1. Stop the array
2. Unassign disk 15
3. Start the array
4. Stop it again
This is what you need to do to see whether the actual disk mounts with UD and its contents look correct. Post back after that.
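To make the rebuild versus re-sync distinction concrete, here is a minimal sketch that models single parity as a byte-wise XOR across disks. It is a toy illustration, not Unraid's actual driver code: the function names, the three tiny "disks" and their contents are all hypothetical. It only shows why the two operations trust different sources.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def sync_parity(data_disks):
    """Re-sync: trust every data disk and recompute parity from them."""
    return xor_blocks(data_disks)


def rebuild_disk(parity, other_disks):
    """Rebuild: trust parity plus the other disks and regenerate the target disk."""
    return xor_blocks([parity, *other_disks])


# Toy array with hypothetical contents (assumption: single XOR parity).
disk1 = b"\x01\x02\x03\x04"
disk2 = b"\x10\x20\x30\x40"
disk15_actual = b"\xaa\xbb\xcc\xdd"    # what is physically on the disk
disk15_emulated = b"\xaa\xbb\x00\x00"  # what parity currently "remembers"

# Parity that no longer matches the physical disk 15.
stale_parity = xor_blocks([disk1, disk2, disk15_emulated])

# Rebuild: the emulated contents would be written over the physical disk.
rebuilt = rebuild_disk(stale_parity, [disk1, disk2])

# Re-sync (after a new config): the physical disk is kept, parity is rewritten.
new_parity = sync_parity([disk1, disk2, disk15_actual])

print(rebuilt == disk15_emulated)
print(rebuild_disk(new_parity, [disk1, disk2]) == disk15_actual)
```

In this model, a rebuild reproduces whatever parity encodes, so it only makes sense when the emulated disk is the copy you want to keep. When the physical disk is the good copy, recomputing parity from it is the safer direction.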
abhi.ko Posted December 7, 2020
10 hours ago, JorgeB said:
That won't re-sync parity; it will rebuild the disabled disk on top, which is probably not what you want in this case.
This is what you need to do to see whether the actual disk mounts with UD and its contents look correct. Post back after that.
Thanks! Did that, and it looks good after un-assigning from the array and mounting in UD.
Dec 7 11:31:09 Tower kernel: sd 10:0:2:0: [sdq] 15628053168 512-byte logical blocks: (8.00 TB/7.28 TiB)
Dec 7 11:31:09 Tower kernel: sd 10:0:2:0: [sdq] 4096-byte physical blocks
Dec 7 11:31:09 Tower kernel: sd 10:0:2:0: [sdq] Write Protect is off
Dec 7 11:31:09 Tower kernel: sd 10:0:2:0: [sdq] Mode Sense: 9b 00 10 08
Dec 7 11:31:09 Tower kernel: sd 10:0:2:0: [sdq] Write cache: enabled, read cache: enabled, supports DPO and FUA
Dec 7 11:31:09 Tower kernel: sdq: sdq1
Dec 7 11:31:09 Tower kernel: sd 10:0:2:0: [sdq] Attached SCSI disk
Dec 7 11:31:32 Tower emhttpd: ST8000VN004-2M2101_WKD08RSZ (sdq) 512 15628053168
Dec 7 11:31:33 Tower kernel: mdcmd (16): import 15 sdq 64 7814026532 0 ST8000VN004-2M2101_WKD08RSZ
Dec 7 11:31:33 Tower kernel: md: import disk15: (sdq) ST8000VN004-2M2101_WKD08RSZ size: 7814026532
Dec 7 11:31:38 Tower emhttpd: shcmd (53): /usr/local/sbin/set_ncq sdq 1
Dec 7 11:31:38 Tower root: set_ncq: setting sdq queue_depth to 1
Dec 7 11:31:38 Tower emhttpd: shcmd (54): echo 128 > /sys/block/sdq/queue/nr_requests
Dec 7 12:57:25 Tower emhttpd: ST8000VN004-2M2101_WKD08RSZ (sdq) 512 15628053168
Dec 7 12:58:22 Tower unassigned.devices: Issue spin down timer for device '/dev/sdq'.
Dec 7 12:59:38 Tower unassigned.devices: Adding disk '/dev/sdq1'...
Dec 7 12:59:38 Tower unassigned.devices: Mount drive command: /sbin/mount -t xfs -o rw,noatime,nodiratime '/dev/sdq1' '/mnt/disks/ST8000VN004-2M2101_WKD08RSZ'
Dec 7 12:59:38 Tower kernel: XFS (sdq1): Mounting V5 Filesystem
Dec 7 12:59:38 Tower kernel: XFS (sdq1): Starting recovery (logdev: internal)
Dec 7 12:59:38 Tower kernel: XFS (sdq1): Ending recovery (logdev: internal)
Dec 7 12:59:38 Tower unassigned.devices: Successfully mounted '/dev/sdq1' on '/mnt/disks/ST8000VN004-2M2101_WKD08RSZ'.
Dec 7 12:59:38 Tower unassigned.devices: Issue spin down timer for device '/dev/sdq'.
The content looks okay as well, screenshot below. I have no way of verifying whether this is what was on Disk 15 before the FS corruption, but it looks good to me from a cursory look. Diagnostics are attached as well. What should I do next?
tower-diagnostics-20201207-1305.zip
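For anyone who wants to check a syslog like this without reading it line by line, the sketch below pulls out the XFS kernel messages and the final mount point for one partition. The regular expressions are modelled only on the message formats visible in the excerpt above, so treat them as assumptions that may not match other Unraid or plugin versions; `summarize_mount` is a hypothetical helper name, not part of any tool.

```python
import re

# Patterns modelled on the log excerpt above (assumed formats, not an official spec).
MOUNTED = re.compile(r"unassigned\.devices: Successfully mounted '([^']+)' on '([^']+)'")
XFS_EVENT = re.compile(r"kernel: XFS \((\w+)\): (.+)$")


def summarize_mount(log_lines, partition):
    """Collect the XFS kernel messages and the mount point logged for one partition."""
    summary = {"mount_point": None, "xfs_messages": []}
    for line in log_lines:
        xfs = XFS_EVENT.search(line)
        if xfs and xfs.group(1) == partition:
            summary["xfs_messages"].append(xfs.group(2).strip())
            continue
        mounted = MOUNTED.search(line)
        if mounted and mounted.group(1) == f"/dev/{partition}":
            summary["mount_point"] = mounted.group(2)
    return summary


# Lines copied from the log posted above.
sample = [
    "Dec 7 12:59:38 Tower kernel: XFS (sdq1): Mounting V5 Filesystem",
    "Dec 7 12:59:38 Tower kernel: XFS (sdq1): Starting recovery (logdev: internal)",
    "Dec 7 12:59:38 Tower kernel: XFS (sdq1): Ending recovery (logdev: internal)",
    "Dec 7 12:59:38 Tower unassigned.devices: Successfully mounted '/dev/sdq1' "
    "on '/mnt/disks/ST8000VN004-2M2101_WKD08RSZ'.",
]

result = summarize_mount(sample, "sdq1")
print(result["mount_point"])
for message in result["xfs_messages"]:
    print(message)
```

A mount point of `None` would mean no success message was found for that partition. A clean mount in the log says nothing about whether the files are the expected ones, so browsing the contents is still needed.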
JorgeB Posted December 8, 2020
Now do a new config, keep all assignments as they were and re-sync parity.
abhi.ko Posted December 8, 2020
2 hours ago, JorgeB said:
Now do a new config, keep all assignments as they were and re-sync parity.
Oh, I see now. I have never used the New Config option before, and I don't think it even existed when I built the server in 2011. Here is what I did:
1. Stopped the array.
2. Tools --> New Config --> Keep All Assignments --> Apply.
3. Added Disk 15 back to the array, using the same disk as before.
4. Started the array.
Everything looks good so far. The parity sync is in progress; I will report back on how it goes. Thanks a ton for your help.
I will try to troubleshoot the issue of disks not being detected soon after. Hopefully I won't mess anything else up in the process. Any guidance you can provide there would be helpful. BTW, I did check the BIOS boot settings: 'Option ROM' and 'UEFI and Legacy OPROM' are selected under boot devices control and CSM is enabled, but still no luck getting the LSI BIOS to show up.