graywolf Posted November 20, 2021 Posted November 20, 2021 (edited) Was replacing the Parity drive with a larger drive. The last Parity Check (11/6/2021) was good, without errors. Got messages that Disk18 (6TB) has tons of Read Errors and 6 Pending Sector errors. Parity-Sync is still running (past 6TB). I assume that the Parity-Sync is invalid? Advice on how best to proceed? tower-diagnostics-20211120-1525.zip Edited November 25, 2021 by graywolf update title Quote
graywolf Posted November 20, 2021 Author Posted November 20, 2021 Additional info. It looks like a smart report was not available for Disk18 in the Diagnostics. Tried doing it individually and got the attached which said: Smartctl open device: /dev/sdu failed: No such device Also noticed that it looks like /var/log is at 100%
Quote
root@Tower:/var/log# df
Filesystem 1K-blocks Used Available Use% Mounted on
rootfs 8158744 669184 7489560 9% /
tmpfs 32768 408 32360 2% /run
devtmpfs 8158760 0 8158760 0% /dev
tmpfs 8210092 0 8210092 0% /dev/shm
cgroup_root 8192 0 8192 0% /sys/fs/cgroup
tmpfs 131072 131072 0 100% /var/log
/dev/sda1 15000232 393904 14606328 3% /boot
/dev/loop0 8832 8832 0 100% /lib/modules
/dev/loop1 6016 6016 0 100% /lib/firmware
/dev/md1 5858435620 5370817808 487617812 92% /mnt/disk1
/dev/md2 5858435620 5389663876 468771744 92% /mnt/disk2
/dev/md3 5858435620 5383870316 474565304 92% /mnt/disk3
/dev/md4 5858435620 5399408132 459027488 93% /mnt/disk4
/dev/md5 7811939620 7327696028 484243592 94% /mnt/disk5
/dev/md6 5858435620 5372153508 486282112 92% /mnt/disk6
/dev/md7 5858435620 5375539600 482896020 92% /mnt/disk7
/dev/md8 5858435620 5370944128 487491492 92% /mnt/disk8
/dev/md9 5858435620 5371322200 487113420 92% /mnt/disk9
/dev/md10 5858435620 5373463284 484972336 92% /mnt/disk10
/dev/md11 5858435620 5370387596 488048024 92% /mnt/disk11
/dev/md12 5858435620 5376001688 482433932 92% /mnt/disk12
/dev/md13 5858435620 5408146252 450289368 93% /mnt/disk13
/dev/md14 5858435620 5373271592 485164028 92% /mnt/disk14
/dev/md15 5858435620 5371518072 486917548 92% /mnt/disk15
/dev/md16 5858435620 5312117804 546317816 91% /mnt/disk16
/dev/md17 5858435620 5310747332 547688288 91% /mnt/disk17
/dev/md18 5858435620 5335694900 522740720 92% /mnt/disk18
/dev/md19 7811939620 6899110408 912829212 89% /mnt/disk19
/dev/md20 5858435620 4946753272 911682348 85% /mnt/disk20
/dev/md21 5858435620 5187500852 670934768 89% /mnt/disk21
/dev/md22 5858435620 4941532744 916902876 85% /mnt/disk22
/dev/sdv1 117220792 16768 116158592 1% /mnt/cache
shfs 132792591640 120567661392 12224930248 91% /mnt/user0
shfs 132909812432 120567678160 12341088840 91% /mnt/user
/dev/loop2 41943040 2876936 37220056 8% /var/lib/docker
WDC_WD60EFRX-68L0BN1_WD-WX11D26H4CE2-20211120-1121.txt Quote
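A long df listing like the one in this post is easy to misread, so here is a minimal sketch for picking out the full filesystems. It reads df-style output on stdin, so it also works on saved output. The helper name full_mounts is hypothetical and not part of Unraid.

```shell
#!/bin/bash
# full_mounts THRESHOLD  (hypothetical helper, not an Unraid tool)
# Reads `df -P` style output on stdin and prints every mount point whose
# Use% is at or above THRESHOLD, followed by the reported percentage.
full_mounts() {
    local threshold="$1"
    awk -v t="$threshold" 'NR > 1 {
        use = $5
        sub(/%/, "", use)          # strip the trailing percent sign
        if (use + 0 >= t) print $6 " " $5
    }'
}

# Example usage (not run here):
#   df -P | full_mounts 100
# To see which logs are taking the space in /var/log:
#   du -sh /var/log/* | sort -h
```

Run against the listing in this post, it would also report /lib/modules and /lib/firmware, which are typically read-only loop images that normally show as full. The /var/log entry is the one worth investigating.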
JorgeB Posted November 21, 2021 Posted November 21, 2021 16 hours ago, graywolf said: It looks like a smart report was not available for Disk18 It wasn't, power cycle the server (don't just reboot) and post new diags. Quote
graywolf Posted November 21, 2021 Author Posted November 21, 2021 (edited) Did power down, then restarted server. Since no parity, I put in a new drive and am trying to copy the contents of Disk18 to Disk23 (unBalance Copy) now to see what I can get. Still getting read errors from disk18. Attaching current diagnostics file any help/advice would be greatly appreciated or if there is a better/faster process than unBalance Copy to transfer what I can from disk18 to disk23 tower-diagnostics-20211121-1351.zip Edited November 21, 2021 by graywolf Quote
JorgeB Posted November 22, 2021 Posted November 22, 2021 Disk18 is failing. If there were no writes to the array since you replaced parity, you can try to use the old parity drive to rebuild disk18. If there were writes, the best bet is to use ddrescue on it. Quote
graywolf Posted November 22, 2021 Author Posted November 22, 2021 There were writes unfortunately. So running without parity. I have never used ddrescue and don't want to screw up more. It says that "Both source and destination disks can't be mounted" So what is the best way to unmount disk18? With disk18 unmounted, can the rest of the array be running? Does the destination drive (precleared) need to be formatted first? Quote
JorgeB Posted November 22, 2021 Posted November 22, 2021 22 minutes ago, graywolf said: It says that "Both source and destination disks can't be mounted" That just means you need to stop the array. If you want to keep using the array without disk18, you can do a new config without it. Quote
graywolf Posted November 22, 2021 Author Posted November 22, 2021 Running ddrescue. Is there a good link that explains what the different labels mean? Some are obvious but others are not, e.g. non-trimmed = space that is bad during the 1st pass? But that doesn't seem to match what I get when I subtract the rescued amount from the opos number. Also, Bad Sectors, Bad Areas, and Error rate are 0, but there are 76 read errors. Quote
JorgeB Posted November 22, 2021 Posted November 22, 2021 The most important part is the percent rescued. In the end it's usually 99.9% in most cases, though it can be much less for really bad disks. There's also an option mentioned in the FAQ that can list the affected files, so you can delete/restore them. Quote
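For reference, the percent rescued figure is simply rescued bytes divided by the total size of the source. A minimal sketch of that arithmetic, assuming both values are given in bytes; the helper name pct_rescued and the numbers in the example are made up for illustration and are not taken from this thread.

```shell
#!/bin/bash
# pct_rescued RESCUED_BYTES TOTAL_BYTES  (hypothetical helper)
# Prints the rescued share of the disk as a percentage with two decimals.
pct_rescued() {
    awk -v r="$1" -v t="$2" 'BEGIN {
        if (t <= 0) exit 1         # avoid dividing by zero
        printf "%.2f\n", (r / t) * 100
    }'
}

# Example usage with made-up round numbers for a 6 TB disk:
#   pct_rescued 5994000000000 6000000000000
```

Awk is used here because plain shell arithmetic is integer-only and would lose the decimals.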
trurl Posted November 22, 2021 Posted November 22, 2021 On 11/20/2021 at 10:35 AM, graywolf said: Disk18 (6TB) has tons of Read Errors and 6 Pending Sector errors. Do you have Notifications setup to alert you immediately by email or other agent as soon as a problem is detected? Quote
graywolf Posted November 22, 2021 Author Posted November 22, 2021 I set it up last month, prior to Nov 6th Parity Check which ran without issue. Never saw anything for Disk18 until the Parity-Resync when I tried to replace the Parity drive with a larger drive. Unfortunately stuff had written to the array during that process so the old parity drive wouldn't be usable. Quote
trurl Posted November 22, 2021 Posted November 22, 2021 2 minutes ago, graywolf said: set it up last month, prior to Nov 6th Parity Check which ran without issue. I have had pending sectors popup suddenly during parity operations also since they make it access every part of the disk. Quote
graywolf Posted November 22, 2021 Author Posted November 22, 2021 so "ddrescue -f --fill=- ~/fill.txt /dev/sdY /boot/ddrescue.log" will add the text inside the file that has bad blocks but if I understand correctly, that shouldn't matter for any type of file (txt, pdf, jpg, mkv, etc) since that file is corrupted anyways and would not open/play correctly. Is that correct? Quote
JorgeB Posted November 22, 2021 Posted November 22, 2021 Yes, video files might still play, but there could be some visible corruption during play. Quote
graywolf Posted November 22, 2021 Author Posted November 22, 2021 Interesting. Just to clarify, if a video file did play, the visible corruption would be because of the bad blocks anyways and not the addition of the text from the ddrescue fill cmd? Otherwise, is there another method other than adding the text with the ddrescue --fill to correlate the bad blocks with the files? Quote
JorgeB Posted November 22, 2021 Posted November 22, 2021 33 minutes ago, graywolf said: if a video file did play, the visible corruption would be because of the bad blocks anyways and not the addition of the text from the ddrescue fill cmd? Correct. Quote
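The fill approach discussed in the last few posts can be sketched roughly as follows. The ddrescue command is the one quoted in the thread. The marker string, the mount point, and the helper name list_marked_files are assumptions made for illustration. The steps that write to a real disk are left as comments so nothing destructive runs by accident.

```shell
#!/bin/bash
# 1. Create the fill file with a marker string that is unlikely to occur in
#    real data (the marker "UNRAID_BAD_BLOCK" is an assumption):
#      printf 'UNRAID_BAD_BLOCK ' > ~/fill.txt
# 2. Overwrite the bad areas recorded in the log on the DESTINATION disk
#    with that marker (command as quoted in the thread; /dev/sdY is the
#    destination, never the failing source):
#      ddrescue -f --fill=- ~/fill.txt /dev/sdY /boot/ddrescue.log
# 3. Mount the destination read-only, then list the files containing the
#    marker. Those are the files that sat on unreadable sectors.

# list_marked_files DIR MARKER  (hypothetical helper)
# Recursively prints the names of files under DIR that contain MARKER.
list_marked_files() {
    grep -rlF -- "$2" "$1" | sort
}

# Example usage (not run here; /mnt/rescue is a hypothetical mount point):
#   list_marked_files /mnt/rescue UNRAID_BAD_BLOCK
```

As noted above, the marker only replaces data that was already unreadable, so any file it lands in was damaged anyway. Bad sectors that held no file data produce no entries in the list.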
graywolf Posted November 22, 2021 Author Posted November 22, 2021 I read that after ddrescue completes its thing, "After the clone is complete you can mount the destination disk manually". What would be the command to do that? I don't have the UD plugin installed. And would mounting the disk be the same as adding it to the array? (no parity currently) Quote
JorgeB Posted November 22, 2021 Posted November 22, 2021 5 minutes ago, graywolf said: And would mounting the disk be the same as adding it to the array? (no parity currently) Without parity you can add it to the array to an empty slot without issues, if you want to add it to a previously assigned disk slot you still can but need to do a new config first. Quote
graywolf Posted November 22, 2021 Author Posted November 22, 2021 "ddrescue -f --fill=- ~/fill.txt /dev/sdY /boot/ddrescue.log" What if a bad sector did not have any file(s)? does it create a file and fill it with the text? Quote
JorgeB Posted November 23, 2021 Posted November 23, 2021 10 hours ago, graywolf said: What if a bad sector did not have any file(s)? It doesn't create files, it just changes those sectors. Quote
graywolf Posted November 23, 2021 Author Posted November 23, 2021 What should I do now at this point? ddrescue: Input file disappeared: No such file or directory looks like the source disk went offline? Quote
JorgeB Posted November 23, 2021 Posted November 23, 2021 9 minutes ago, graywolf said: looks like the source disk went offline? Looks like it dropped offline. You can try power cycling the server. If the disk comes back online, run the command again (check that the drive identifiers are still the same, and adjust as necessary if not). Use the same log file and it will resume. Quote
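Since the sdX letters can change after a power cycle, one way to check the identifiers before resuming is to resolve the disk's persistent name to its current device node. A minimal sketch: the helper name resolve_disk is hypothetical, and the by-id name in the example is an assumption built from the model and serial in the SMART report file name posted earlier.

```shell
#!/bin/bash
# resolve_disk BY_ID_PATH  (hypothetical helper)
# Prints the device node a persistent /dev/disk/by-id link currently points
# to, or fails if the disk is not present.
resolve_disk() {
    if [ ! -e "$1" ]; then
        echo "not found: $1" >&2
        return 1
    fi
    readlink -f -- "$1"
}

# Example usage (not run here; by-id name and /dev/sdY are assumptions):
#   SRC="$(resolve_disk /dev/disk/by-id/ata-WDC_WD60EFRX-68L0BN1_WD-WX11D26H4CE2)"
#   ddrescue -f "$SRC" /dev/sdY /boot/ddrescue.log
# Reusing the same log file is what lets ddrescue pick up where it stopped.
```

Check the destination the same way, because writing to the wrong device with -f would overwrite it.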
graywolf Posted November 23, 2021 Author Posted November 23, 2021 Could I stop the ddrescue scraping failed blocks, then mount the destination drive to check on a couple files (and maybe copy them to another drive), then stop array, unmount the drive, and then restart ddrescue again? Quote
JorgeB Posted November 23, 2021 Posted November 23, 2021 10 minutes ago, graywolf said: Could I stop the ddrescue scraping failed blocks, then mount the destination drive to check on a couple files (and maybe copy them to another drive), then stop array, unmount the drive, and then restart ddrescue again? That's OK if you mount the disk read only, you can do that with UD, don't mount read/write or some data will be changed just by mounting/unmounting and that can cause issues. Quote
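Without the UD plugin, a read-only mount from the command line could look like the sketch below. It assumes the destination holds an XFS filesystem on its first partition. The device name, the mount point, and the helper name ro_mount_cmd are hypothetical. The helper only prints the command so it can be reviewed before being run by hand.

```shell
#!/bin/bash
# ro_mount_cmd DEVICE MOUNTPOINT  (hypothetical helper)
# Prints a read-only mount command instead of running it.
# "norecovery" is an XFS mount option that skips log replay, so nothing is
# written to the disk at mount time. It is only valid together with "ro".
ro_mount_cmd() {
    printf 'mount -t xfs -o ro,norecovery %s %s\n' "$1" "$2"
}

# Example usage (not run here; names are assumptions):
#   mkdir -p /mnt/rescue
#   ro_mount_cmd /dev/sdY1 /mnt/rescue     # review the printed command, then run it
#   umount /mnt/rescue                     # before restarting ddrescue
```

With norecovery the filesystem may look slightly inconsistent if its log was dirty, which is acceptable for a quick check of a few files.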
graywolf Posted November 23, 2021 Author Posted November 23, 2021 Once I'm done with ddrescue, it talks about being a good idea to run a file system check on the destination drive. How do I do that for an XFS disk? Do I mount it before the file system check or leave it unmounted? About how long should a file system check run on a 6TB drive (if size of disk matters)? Quote