thefly Posted March 19, 2018
After suffering numerous disk issues I have completed a non-correcting parity check and am left with what I believe are numerous disk irregularities. Can someone suggest the next steps for dealing with them? Attached is a screenshot of the Main page. Thanks.
BradJ Posted March 19, 2018
You should always upload your log files so people can help you troubleshoot.
JorgeB Posted March 19, 2018
It looks like multiple disks dropped offline, likely a controller issue. Grab the current diagnostics, reboot, grab new diagnostics, and post both.
bonienl Posted March 19, 2018
Once your array is back in good shape, you should consider converting your disks from ReiserFS to XFS. RFS is no longer developed, and in your situation, with nearly full disks, it will perform very poorly.
thefly (Author) Posted March 20, 2018
Restarted. See pre and post diagnostics. Disk 5, Disk 10 now gone.
Attachments: tower-diagnostics-current.zip, tower-diagnostics-restart.zip
JorgeB Posted March 20, 2018
Quote: "Disk 5, Disk 10 now gone"
You didn't have a disk5. Disk10 completely dropped offline:
Mar 16 15:59:59 Tower kernel: ata12: hard resetting link
Mar 16 15:59:59 Tower kernel: ata12.00: failed to read native max address (err_mask=0x1)
Mar 16 15:59:59 Tower kernel: ata12.00: HPA support seems broken, skipping HPA handling
Mar 16 15:59:59 Tower kernel: ata12.00: both IDENTIFYs aborted, assuming NODEV
Mar 16 15:59:59 Tower kernel: ata12.00: revalidation failed (errno=-2)
Mar 16 16:00:04 Tower kernel: ata12: hard resetting link
Mar 16 16:00:04 Tower kernel: ata12.00: both IDENTIFYs aborted, assuming NODEV
Mar 16 16:00:04 Tower kernel: ata12.00: revalidation failed (errno=-2)
Mar 16 16:00:04 Tower kernel: ata12.00: disabled
Check the connections or try it in a different PC; if it's still not detected, it's likely dead.
JorgeB Posted March 20, 2018
Also, these disks are likely failing. Do you have notifications enabled? Run an extended SMART test to confirm:
Device Model: ST2000DM001-9YN164
Serial Number: S1E0562C
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 8
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 8
Device Model: ST2000DM001-9YN164
Serial Number: S240BXD9
187 Reported_Uncorrect 0x0032 066 066 000 Old_age Always - 34
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 40
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 40
Device Model: ST3000DM001-9YN166
Serial Number: Z1F0VLHS
5 Reallocated_Sector_Ct 0x0033 067 052 036 Pre-fail Always - 44240
183 Runtime_Bad_Block 0x0032 097 097 000 Old_age Always - 3
187 Reported_Uncorrect 0x0032 001 001 000 Old_age Always - 3117
197 Current_Pending_Sector 0x0012 001 001 000 Old_age Always - 20672
198 Offline_Uncorrectable 0x0010 001 001 000 Old_age Offline - 20672
Device Model: WDC WD20EARS-00MVWB0
Serial Number: WD-WMAZ20246473
197 Current_Pending_Sector 0x0032 196 196 000 Old_age Always - 1462
Device Model: WDC WD20EARS-00MVWB0
Serial Number: WD-WMAZ20266575
5 Reallocated_Sector_Ct 0x0033 041 041 140 Pre-fail Always FAILING_NOW 1265
196 Reallocated_Event_Count 0x0032 001 001 000 Old_age Always - 1265
197 Current_Pending_Sector 0x0032 196 196 000 Old_age Always - 1394
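For anyone following along, here is a minimal sketch of how the extended test could be started from the command line and how attribute lines like the ones above could be filtered. The `smartctl` commands in the comments are standard smartmontools usage; `flag_smart` is a hypothetical helper name, `sdX` is a placeholder, and the filter assumes the raw value is the last column of each line, as it is in the reports quoted here.

```shell
#!/bin/sh
# Print SMART attributes 5, 187, 197 and 198 whose raw value (last column)
# is non-zero. Reads `smartctl -A` style attribute lines on stdin.
# flag_smart is a hypothetical helper, not part of smartmontools.
flag_smart() {
    awk '($1 == 5 || $1 == 187 || $1 == 197 || $1 == 198) && $NF + 0 > 0 {
        print $2 " raw=" $NF
    }'
}

# Usage on a live system (sdX is a placeholder; needs root):
#   smartctl -t long /dev/sdX        # start the extended self-test
#   smartctl -l selftest /dev/sdX    # read the result once it finishes
#   smartctl -A /dev/sdX | flag_smart
```

Any line this prints is a reason to look at the disk more closely; it does not replace the extended test itself.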
JorgeB Posted March 20, 2018
These disks looked familiar, and I see why. You could have mentioned your previous thread, which would have saved me the time of going through all the SMART reports again. I also see you didn't follow my previous advice to run extended SMART tests, but the errors in your first screenshot line up with the suspect disks, so that's confirmation enough that they are failing.
I'm still not sure what you're trying to accomplish. unRAID can't keep working with more bad disks than parity disks (one in your case), so you basically have 2 options:
1) Do a new config with all the good disks plus new disks to replace the failing ones, re-sync parity, and then copy everything you can from the failing disks to the new ones, e.g., by mounting one at a time with the UD (Unassigned Devices) plugin.
2) Clone all failing disks to new ones using ddrescue, then do a new config and re-sync parity.
trurl Posted March 20, 2018
4 hours ago, johnnie.black said: "you could have mentioned your previous thread,"
In fact, you should have just used your previous thread, since all this is just a continuation of that dire situation you have allowed to happen.
4 hours ago, johnnie.black said: "Still not sure on what you're trying to accomplish, unRAID can't keep working with more bad disks than parity disks, one in your case"
I don't recall, and I'm not going to go back and read it all again. Do you have backups? Maybe before doing anything else you should copy whatever irreplaceable and important files you may be able to access to your PC.
You are in serious danger of losing a lot of data. A single parity disk cannot help with multiple drive failures.
thefly (Author) Posted March 21, 2018
I'm stuck with Disk 2 showing "missing, contents emulated" and Disk 10 showing "disabled, contents emulated". Obviously I cannot start the array. I am resigned to losing these disks but would like some direction on the order of new drive replacements. Is there any way to get a directory of what was on the drives? Do I replace Disk 2 or Disk 10 first? Depressing...
trurl Posted March 21, 2018
6 minutes ago, thefly said: "I'm stuck with Disk 2 showing missing, contents emulated and Disk 10 disabled, contents emulated."
No disks are emulated, since you have single parity but 2 missing or disabled disks. The course of action suggested by johnnie.black seems like the best idea:
17 hours ago, johnnie.black said:
"1) do a new config with all the good disks plus new disks to replace the failing ones, re-sync parity and then copy everything you can from the failing disks to the new ones, e.g., by mounting one at a time with the UD plugin.
2) clone all failing disks to new ones using ddrescue and then do a new config and re-sync parity."
The 1st option is probably going to be the simplest for you. The 2nd option would actually have fewer steps but requires working carefully at the command line.
JorgeB Posted March 21, 2018
7 hours ago, trurl said: "The 1st option is probably going to be the simplest for you. The 2nd option would actually have fewer steps but requires working carefully at the command line."
The 2nd option could be useful if, for example, the data is mostly videos and you don't have backups or source files. With the 1st option, any file with even a single read error won't be copied. The 2nd option should copy most files, skipping over any errors. The affected files will obviously be corrupt, but for video files with just a few errors they should still be playable, with or without some glitches. That is still better than nothing when there are no backups.
thefly (Author) Posted March 21, 2018
Both Disk 2 and Disk 10 are mostly videos. I have no source backup files. I have chosen ddrescue and have installed the plugin. As I understand it, if I place a new drive in free slot 19 and wish to clone the disabled Disk 10, I would enter the following terminal command to attempt a clone?
ddrescue -f /dev/sd19 /dev/sd10 /boot/ddrescue.log
Edited March 21, 2018 by thefly
trurl Posted March 21, 2018
1 minute ago, thefly said: "Both Disk 2 and 10 are mostly videos. I have no source back-up files. I have chosen ddrescue and have installed the plug in. As I understand it, if I place a new drive in free slot 19 and wish to clone disable device Disk 10, I would enter the following terminal command to attempt a clone??: ddrescue -f /dev/sd19 /dev/sd10 /boot/ddrescue.log"
Those devices don't exist. You need to use the drive letters corresponding to those disk numbers. Be very, very careful here, since the drive letters assigned to a particular disk can change on each boot.
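Because the letters can move between boots, one way to pin down the right device is to go by serial number, which is shown in the SMART reports. The sketch below assumes the usual udev layout, where `/dev/disk/by-id/` holds symlinks whose names end in the drive serial and point at the current `sdX` node. `dev_for_serial` is a hypothetical helper name, and the optional second argument exists only so it can be tried against a scratch directory.

```shell
#!/bin/sh
# Resolve a drive serial number to its current kernel device node.
# $1 = serial number, $2 = by-id directory (defaults to /dev/disk/by-id).
# dev_for_serial is a hypothetical helper, not an unRAID or udev command.
dev_for_serial() {
    serial=$1
    dir=${2:-/dev/disk/by-id}
    # Names ending in the serial are whole disks; partition links end in -partN.
    for link in "$dir"/*"$serial"; do
        [ -e "$link" ] || continue
        readlink -f "$link"
        return 0
    done
    echo "no device found for serial $serial" >&2
    return 1
}

# Example (serial taken from the SMART output earlier in the thread):
#   dev_for_serial S1E0562C
```

Running `ls -l /dev/disk/by-id/` by hand shows the same mapping, and it is worth re-checking right before any destructive command.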
JorgeB Posted March 21, 2018
2 minutes ago, thefly said: "As I understand it, if I place a new drive in free slot 19 and wish to clone disable device Disk 10, I would enter the following terminal command to attempt a clone??:"
That's not correct. Since you need to do a new config anyway, I would recommend cloning to a disk outside the array, since it's faster.
trurl Posted March 21, 2018
Also, I have never used this, but it looks like you have the source and destination backwards. BE CAREFUL!
JorgeB Posted March 21, 2018
11 minutes ago, trurl said: "BE CAREFUL!"
Yes, from the link:
ddrescue -f /dev/sdX /dev/sdY /boot/ddrescue.log
Quote: "Both source and destination disks can't be mounted, replace X with source disk, Y with destination, always triple check these, if the wrong disk is used as destination it will be overwritten deleting all data."
X and Y are the disk identifiers, like sdb, sdc, etc.
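As a rough illustration of the "triple check" advice, the sketch below refuses obviously unsafe source/destination pairs and then only prints the command for review; it never runs ddrescue itself. `clone_cmd` is a hypothetical helper name, and the mount check simply looks for the device name in `/proc/mounts`. A disk assigned to a started array is mounted through an md device, so this check would not catch it; the array should be stopped regardless.

```shell
#!/bin/sh
# Sanity-check a source/destination pair, then print the ddrescue command
# from the thread for manual review. Nothing is executed against the disks.
# $1 = source (failing disk), $2 = destination (new disk, will be overwritten),
# $3 = mounts table (defaults to /proc/mounts).
# clone_cmd is a hypothetical helper, not part of ddrescue.
clone_cmd() {
    src=$1
    dst=$2
    mounts=${3:-/proc/mounts}
    if [ -z "$src" ] || [ -z "$dst" ] || [ "$src" = "$dst" ]; then
        echo "source and destination must be two different devices" >&2
        return 1
    fi
    for dev in "$src" "$dst"; do
        # Prefix match, so /dev/sdb also catches its partitions (/dev/sdb1).
        if grep -q "^$dev" "$mounts" 2>/dev/null; then
            echo "$dev is mounted; unmount it first" >&2
            return 1
        fi
    done
    echo "ddrescue -f $src $dst /boot/ddrescue.log"
}

# Example (sdb and sdd are placeholders for the real identifiers):
#   clone_cmd /dev/sdb /dev/sdd
```

Printing instead of executing is deliberate: the printed line can be compared against the serial numbers one last time before it is pasted into the terminal.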
thefly (Author) Posted March 21, 2018
Got it. So can I just add a disk using the UD plugin and clone to it?
JorgeB Posted March 21, 2018
Yes, both disks just need to be connected to the server.