mrbens Posted September 11, 2021 (edited)

Hello, I'm in need of assistance please. Attached are diagnostics from the time of the issue (before the reboot) and from today.

An error popped up saying disk 7 had failed. I saved the diagnostic logs and rebooted. The plan was to try a preclear of disk 7 to see if I could reintroduce it. The syslog is full of disk 1 errors but I can't see anything for disk 7.

Disk 1 then showed as missing after the reboot and has been missing since, so I can't start the server or try to rebuild either disk. A further reboot showed disk 3 also missing, but that one has been showing up OK again since that one time.

I'm struggling to get the server to see disk 1. Clicking 'no device' doesn't show any disks in the drop-down menu.

Swapped the SATA cables between disk 1 and disk 2 (both go into the same 2-port SATA card) but disk 1 still shows as missing.
Swapped the SATA cables between disk 1 and parity (parity goes straight into the motherboard) and disk 1 still shows as missing.
Replaced the 4-port SATA power splitter cable that feeds parity and disks 1, 2 and 3, and disk 1 still shows as missing.
Swapped the PSU SATA power cable that feeds the 4x SATA splitter with another splitter but disk 1 is still missing.
Connected a PSU SATA power cable without a splitter to disk 1 and disk 1 is still missing.

Disk 7 SMART data:
Disk 7 SMART overall-health: Passed

When the server powers up there is a hard disk clicking sound for a few seconds. It is probably from disk 1, as it still happens when the disk 7 power cable is removed before booting.

I wonder if disk 7 might be OK to reintroduce somehow so I can rebuild disk 1? Is there a way to try to recover from this without losing all the data on both disks please?

Thanks,
Ben

Attachments:
tower-diagnostics-20210816-0608 [before reboot].zip
tower-diagnostics-20210911-1400 [latest info].zip

Edited September 11, 2021 by mrbens
JorgeB Posted September 12, 2021

Disk1 is likely dead and disk7 appears to be failing. If that's the case, you can't rebuild disk1 with single parity. Still, since there's no SMART report for disk7, I can't see whether a SMART test was run, so try running an extended test to confirm whether the disk has really failed.
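For anyone following along, the extended test suggested above can also be started from the command line with smartctl (part of smartmontools). The helper names below are hypothetical, and the device path you pass in (for example /dev/sdd) is an assumption that has to be checked against your own system first. A minimal sketch:

```shell
# Hypothetical helpers around smartctl; the caller supplies the device path.

# Start an extended (long) self-test. The test runs inside the drive
# in the background, so this command returns straight away.
start_extended_test() {
    smartctl -t long "$1"
}

# Show the drive's self-test log, where finished or aborted tests are recorded.
show_selftest_log() {
    smartctl -l selftest "$1"
}

# Print the full SMART report (health status, attributes, error and self-test logs).
show_smart_report() {
    smartctl -a "$1"
}
```

Typical use would be `start_extended_test /dev/sdX`, followed later by `show_selftest_log /dev/sdX` to see how the test ended.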
mrbens Posted September 12, 2021

6 hours ago, JorgeB said:
Disk1 is likely dead and disk7 appears to be failing. If that's the case, you can't rebuild disk1 with single parity. Still, since there's no SMART report for disk7, I can't see whether a SMART test was run, so try running an extended test to confirm whether the disk has really failed.

Thanks for the reply. The extended SMART test fails at 10%. I tried twice and have attached the output. The short test also won't run and says "Errors occurred - Check SMART report".

Attachment:
tower-smart-20210912-1530.zip
JorgeB Posted September 13, 2021

You can use ddrescue on disk7 to recover as much data as possible. If disk1 is really dead, there's not much you can do about that.
trurl Posted September 13, 2021

Do you have Notifications set up to alert you immediately by email or other agent as soon as a problem is detected? Don't let one problem go ignored until you have multiple problems and data loss.
mrbens Posted September 13, 2021

Thanks JorgeB, I'll try ddrescue. I only got a notification of disk 7 being disabled, nothing for disk 1.

What's the best way to start the array and build parity from the remaining disks please?
JorgeB Posted September 13, 2021

Tools -> New config
Assign the remaining disks and start the array to begin a parity sync.
mrbens Posted September 17, 2021

Thanks JorgeB
mrbens Posted December 22, 2021 (edited)

I'm going to try ddrescue on the disk 7 that was previously disabled due to errors.

Current SMART report:
It doesn't show a file system and mount is grayed out. When the array is stopped it does show the disk as being available to re-add to the array, but I guess that's not a good idea.

The other unassigned disk, sdl, is a spare that was previously in the array until I replaced it with a larger disk and rebuilt. It still has its old data on it.

What's the best way to try a ddrescue of sdd onto sdl please? Should I preclear sdl and add it to the array first, to save having to copy the recovered data over afterwards? Or is it better to leave it unassigned without preclearing and manually copy any data it can recover onto the array after?

If leaving it unmounted, is this definitely the correct command:

ddrescue -f /dev/sdd /dev/sdl /boot/ddrescue.log

Thanks

Edited December 22, 2021 by mrbens
JorgeB Posted December 22, 2021

46 minutes ago, mrbens said:
If leaving unmounted is this definitely the correct command: ddrescue -f /dev/sdd /dev/sdl /boot/ddrescue.log

Yes.
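One check worth making before a disk-to-disk clone like the one confirmed above is that the destination is at least as large as the source, and that the source and destination are not swapped. The wrapper below is only a sketch: `clone_disk` is a hypothetical name, and /dev/sdd and /dev/sdl are the device names from this thread, which can differ on another system or after a reboot.

```shell
# Hypothetical wrapper: refuse to clone onto a destination smaller than the source.
# Arguments: source device, destination device, ddrescue log (map) file.
clone_disk() {
    src=$1
    dst=$2
    map=$3

    # blockdev --getsize64 prints the size of a block device in bytes.
    src_bytes=$(blockdev --getsize64 "$src") || return 1
    dst_bytes=$(blockdev --getsize64 "$dst") || return 1

    if [ "$dst_bytes" -lt "$src_bytes" ]; then
        echo "destination $dst is smaller than source $src" >&2
        return 1
    fi

    # -f (--force) is needed because the output is a device, not a regular file.
    # Everything on the destination will be overwritten.
    ddrescue -f "$src" "$dst" "$map"
}
```

With the names used in this thread, the call would be `clone_disk /dev/sdd /dev/sdl /boot/ddrescue.log`.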
mrbens Posted December 23, 2021

Thanks again. I got this far before it stopped:

root@Tower:~# ddrescue -f /dev/sdd /dev/sdl /boot/ddrescue.log
GNU ddrescue 1.23
Press Ctrl-C to interrupt
     ipos:    1327 GB, non-trimmed:   58475 MB,  current rate:       0 B/s
     opos:    1327 GB, non-scraped:        0 B,  average rate:  53796 kB/s
non-tried:    1672 GB,  bad-sector:        0 B,    error rate:   2370 MB/s
  rescued:    1269 GB,   bad areas:        0,        run time:  6h 33m 12s
pct rescued:   42.29%, read errors:   892328,  remaining time:         n/a
                              time since last successful read:         39s
Copying non-tried blocks... Pass 5 (forwards)
ddrescue: Input file disappeared: No such file or directory

Both disks are still showing under Unassigned Devices. Should I try running the same command again to see if it resumes?
JorgeB Posted December 23, 2021

6 minutes ago, mrbens said: Input file disappeared
This means the disk dropped.

6 minutes ago, mrbens said: Both disks are still showing under Unassigned Devices.
With the same identifier?

7 minutes ago, mrbens said: Should I try running the same command again to see if it resumes?
It will resume if you use the same log file.
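Resuming works because the log file (called a mapfile in current ddrescue documentation) records the status of every block range: `+` for rescued, `?` for not yet tried, and so on. As a rough illustration, the hypothetical helper below totals the rescued bytes from such a file. It is a sketch that assumes the standard text layout: comment lines starting with `#`, one current-status line, then `position size status` lines with hexadecimal values.

```shell
# Hypothetical helper: sum the sizes of all blocks marked '+' (rescued)
# in a ddrescue log/map file and print the total in bytes.
rescued_bytes() {
    total=0
    while read -r pos size status rest; do
        # Skip comment lines and blank lines.
        case "$pos" in
            '#'*|'') continue ;;
        esac
        # Block lines have a hexadecimal size in the second column;
        # the current-status line has a status character there instead.
        case "$size" in
            0x*) ;;
            *) continue ;;
        esac
        if [ "$status" = "+" ]; then
            total=$((total + $size))
        fi
    done < "$1"
    echo "$total"
}
```

For the run in this thread, that would be `rescued_bytes /boot/ddrescue.log`, which should roughly match the "rescued" figure ddrescue prints.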
mrbens Posted December 23, 2021

Thank you. Yes, they both show the same device letters. ddrescue is running again and has resumed where it left off.