September 11, 2021

Hello, in need of assistance please. Attached are diagnostics from the time of the issue (before reboot) and from today.

An error popped up saying disk 7 had failed. I saved the diagnostic logs and rebooted. The plan was to try a preclear of disk 7 to see if I could reintroduce it. The syslog is full of disk 1 errors but I can't see anything for disk 7. Disk 1 then showed as missing after the reboot and has been missing since, so I can't start the server or try to rebuild either disk. A further reboot showed disk 3 also missing, but that one has been showing up OK again since that one time.

I'm struggling to get the server to see disk 1. Clicking 'no device' doesn't show any disks in the drop-down menu. What I have tried so far:

Swapped the SATA cables between disk 1 and disk 2 (both go into the same 2-port SATA card): disk 1 still shows missing.
Swapped the SATA cables between disk 1 and parity (parity goes straight into the motherboard): disk 1 still shows missing.
Replaced the 4-port SATA power splitter cable that feeds parity, disk 1, disk 2 and disk 3: disk 1 still shows missing.
Swapped the PSU SATA power cable that goes into the 4x SATA splitter with another splitter: disk 1 still missing.
Connected a PSU SATA power cable to disk 1 without a splitter: disk 1 still missing.

Disk 7 SMART data: Disk 7 SMART overall-health: Passed

When the server powers up there is a hard disk clicking sound for a few seconds. It is probably from disk 1, as it still happens when disk 7's power cable is removed before booting.

I wonder if disk 7 might be OK to reintroduce somehow so I can rebuild disk 1? Is there a way to try to recover from this without losing all the data on both disks please?

Thanks, Ben

tower-diagnostics-20210816-0608 [before reboot].zip
tower-diagnostics-20210911-1400 [latest info].zip

Edited September 11, 2021 by mrbens
September 12, 2021 Community Expert

Disk1 is likely dead and disk7 appears to be failing. If that's the case, you can't rebuild disk1 with single parity. Still, since there's no SMART report for disk7 I can't see whether a SMART test was run, so try running an extended test to confirm whether the disk has really failed.
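For readers following along: an extended test can also be started from a terminal with smartctl (part of smartmontools), assuming it is available on the server. The following is only a minimal sketch, not something posted in the thread. The function names are made up for illustration and /dev/sdX is a placeholder for whichever device node disk 7 currently has.

```shell
#!/bin/bash
# Hypothetical helpers around smartctl; the device node is always passed in,
# because the thread does not state disk 7's device at this point.

start_extended_test() {
    dev="$1"
    if [ -z "$dev" ]; then
        echo "usage: start_extended_test /dev/sdX" >&2
        return 2
    fi
    # -t long queues the extended self-test; the test runs inside the drive.
    smartctl -t long "$dev"
}

show_selftest_log() {
    # -l selftest prints the drive's self-test log, including how far a
    # failed test got before it stopped.
    smartctl -l selftest "$1"
}
```

Typical use would be `start_extended_test /dev/sdX`, followed some hours later by `show_selftest_log /dev/sdX` to see whether the test completed or aborted with a read failure.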
September 12, 2021 Author

6 hours ago, JorgeB said:
Disk1 is likely dead and disk7 appears to be failing. If that's the case, you can't rebuild disk1 with single parity. Still, since there's no SMART report for disk7 I can't see whether a SMART test was run, so try running an extended test to confirm whether the disk has really failed.

Thanks for the reply. The extended SMART test fails at 10%. I tried twice and have attached the output. The short test also won't run and says "Errors occurred - Check SMART report".

tower-smart-20210912-1530.zip
September 13, 2021 Community Expert

You can use ddrescue on disk7 to recover as much data as possible. If disk1 is really dead, there's not much you can do about that.
September 13, 2021 Community Expert

Do you have Notifications set up to alert you immediately, by email or another agent, as soon as a problem is detected? Don't let one problem go unnoticed until you have multiple problems and data loss.
September 13, 2021 Author

Thanks JorgeB, I'll try ddrescue. I only got a notification of disk 7 being disabled, nothing for disk 1. What's the best way to start the array and build parity from the remaining disks please?
September 13, 2021 Community Expert

Tools -> New config. Assign the remaining disks and start the array to begin a parity sync.
December 22, 2021 Author

I'm going to try ddrescue on disk 7, which was previously disabled due to errors.

Current SMART report:

It doesn't show a file system and mount is grayed out. When the array is stopped it does show the disk as available to re-add to the array, but I guess that's not a good idea. The other unassigned disk, sdl, is a spare that was previously in the array; I replaced it with a larger disk and it still has its data on it.

What's the best way to try ddrescue of sdd onto sdl please? Should I preclear sdl and add it to the array first, to save having to copy the recovered data over afterwards? Or is it better to leave it unassigned without preclearing and manually copy whatever data is recovered onto the array afterwards?

If leaving it unmounted, is this definitely the correct command:

ddrescue -f /dev/sdd /dev/sdl /boot/ddrescue.log

Thanks

Edited December 22, 2021 by mrbens
December 22, 2021 Community Expert

46 minutes ago, mrbens said:
If leaving unmounted is this definitely the correct command: ddrescue -f /dev/sdd /dev/sdl /boot/ddrescue.log

Yes.
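Because -f (--force) allows ddrescue to overwrite a block device given as the output, the argument order matters: source first, destination second. Before running a command like the one above, it can be worth confirming that the destination is at least as large as the source. The sketch below is an illustration only and was not part of the thread. The function name target_fits is hypothetical, and it assumes the blockdev utility from util-linux is present.

```shell
#!/bin/bash
# Hypothetical pre-flight check: succeed only if the destination device is
# at least as large as the source device.
target_fits() {
    # blockdev --getsize64 prints the device size in bytes.
    src_bytes=$(blockdev --getsize64 "$1") || return 2
    dst_bytes=$(blockdev --getsize64 "$2") || return 2
    [ "$dst_bytes" -ge "$src_bytes" ]
}
```

For the devices named in the thread this could be used as `target_fits /dev/sdd /dev/sdl && ddrescue -f /dev/sdd /dev/sdl /boot/ddrescue.log`, so that the copy only starts when the check passes.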
December 23, 2021 Author

Thanks again. I got this far before it stopped:

root@Tower:~# ddrescue -f /dev/sdd /dev/sdl /boot/ddrescue.log
GNU ddrescue 1.23
Press Ctrl-C to interrupt
     ipos:    1327 GB, non-trimmed:   58475 MB,  current rate:       0 B/s
     opos:    1327 GB, non-scraped:        0 B,  average rate:  53796 kB/s
non-tried:    1672 GB,  bad-sector:        0 B,    error rate:    2370 MB/s
  rescued:    1269 GB,   bad areas:        0,        run time:  6h 33m 12s
pct rescued:   42.29%, read errors:   892328,  remaining time:         n/a
                              time since last successful read:         39s
Copying non-tried blocks... Pass 5 (forwards)
ddrescue: Input file disappeared: No such file or directory

Both disks are still showing under Unassigned Devices. Should I try running the same command again to see if it resumes?
December 23, 2021 Community Expert

6 minutes ago, mrbens said:
Input file disappeared

This means the disk dropped.

6 minutes ago, mrbens said:
Both disks are still showing under Unassigned Devices.

With the same identifier?

7 minutes ago, mrbens said:
Should I try running the same command again to see if it resumes?

It will resume if you use the same log file.
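Resuming works because the log file (ddrescue calls it a mapfile) records which regions of the source have already been copied. As a rough illustration, not something from the thread, the sketch below totals the rescued and outstanding bytes in such a file. It assumes the ddrescue 1.2x mapfile layout: comment lines start with '#', the first non-comment line is the current-position status line, and every later line is "pos size status" with '+' meaning rescued. The function name mapfile_summary is made up.

```shell
#!/bin/bash
# Hypothetical helper: summarise a ddrescue mapfile without touching the disks.
mapfile_summary() {
    file="$1"
    seen_status=0
    rescued=0
    remaining=0
    while read -r pos size status rest; do
        # Skip blank lines and comments.
        case "$pos" in ''|'#'*) continue ;; esac
        # The first data line is the status line (current position), not a block.
        if [ "$seen_status" -eq 0 ]; then
            seen_status=1
            continue
        fi
        # Sizes are written in hex (0x...), which shell arithmetic accepts.
        if [ "$status" = "+" ]; then
            rescued=$(( rescued + size ))
        else
            remaining=$(( remaining + size ))
        fi
    done < "$file"
    echo "rescued=$rescued remaining=$remaining"
}
```

Running `mapfile_summary /boot/ddrescue.log` before restarting would show, in bytes, how much has been copied so far and how much is still non-tried, non-trimmed, non-scraped or bad.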
December 23, 2021 Author

Thank you. Yes, they both show the same device letters. ddrescue is running again and has resumed where it left off.