Imba (posted May 18, 2018):
I just recently had to replace a dead flash drive for the server. Right now it's in parity-sync, but the estimate has gone from about 1000 minutes to 316303.2 minutes. I noticed a drive has over 6k errors and needs to be replaced, but can I do this before the sync is done? Is there anything I can do, or do I have to wait it out?
UnRaid: Ver 4.7
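To put the two estimates in perspective, here is a minimal arithmetic sketch. It only converts the figures quoted in the post; the reading that the ratio reflects a drop in read speed is an assumption (it holds only if the amount of work remaining was roughly the same at both readings).

```python
MINUTES_PER_DAY = 60 * 24


def minutes_to_days(minutes: float) -> float:
    """Convert an estimated duration in minutes to days."""
    return minutes / MINUTES_PER_DAY


initial_estimate = 1000.0     # minutes, as first reported by the sync
current_estimate = 316303.2   # minutes, as reported once the errors started

initial_days = minutes_to_days(initial_estimate)
current_days = minutes_to_days(current_estimate)
slowdown = current_estimate / initial_estimate

print(f"initial estimate: {initial_days:.2f} days")
print(f"current estimate: {current_days:.1f} days")
print(f"estimate grew by a factor of about {slowdown:.0f}")
```

An estimate of roughly 220 days means the sync will not finish in any practical sense, which is consistent with a disk that is struggling to be read.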
trurl (posted May 18, 2018):
Moved to Legacy Support.
How is it that you are just now coming to the forum with your first post and it's about a very, very old version of unRAID? Many of us haven't worked with that version, and I just barely remember it myself.
Stop the parity sync until we can get a better idea of what the problem is.
Unfortunately, getting useful diagnostics from that old version is a lot more trouble than we have to go to on the latest versions. We need the syslog and SMART report for that drive giving errors. Even better would be syslog and SMART report for all drives. If you were on V6 you could get all of this in a nice zip to post for us, but instead you will have to get each separately. If you want, you could zip them yourself and then you would only need to attach one thing to your next post.
See here: https://lime-technology.com/forums/topic/9277-how-to-report-a-defect-and-capture-syslog-and-smart-reports/
trurl (posted May 18, 2018):
Be sure to read the first several posts at that link I gave so you know how to get syslog and SMART reports.
Did you have to go into the case to replace the flash drive? Sometimes people will disturb the disk connections if they open the case.
It would also be useful if you could tell us a little about your hardware. It would be nice if you can easily upgrade to V6 after we get this problem squared away.
Imba (posted May 18, 2018):
Well, I've never had a problem with 4.7 so I didn't think there was a reason to upgrade, although I have seen the nifty things that the new versions offer. Plus I thought you had to pay for another license.
I've attached a zip file with all the SMART reports for each drive and the system log. As for the hardware:
MB: ASRock FM2A85X Extreme6
CPU: AMD A4-5300
Mem: 1GB DDR3-1066
Controllers: Adaptec 1430SA x 2
Is there anything else you need? Many thanks!
Attachment: HIVE Syslog and Smart Reports.zip
BobPhoenix (posted May 19, 2018):
I'm still using the USB and license that I got when I had 4.7. Upgrades are free, and so far it doesn't look like that will change any time soon. But I would probably pay for upgrades if I were stuck on a version that only supports 2TB drives like 4.7, and then again when the VM manager and Docker were added.
trurl (posted May 19, 2018):
You have multiple disks with issues. Unfortunately, the syslog has rotated and shows only the recent errors, with none of the older information that would let me identify each disk by its assigned slot. I can see the serial numbers in the SMART reports, though, and that will be enough. You could perhaps get older syslogs from /var/log/syslog.1, /var/log/syslog.2, etc., but it's probably not necessary.
The latest unRAID makes all this much easier. It also helps you keep track of impending issues by notifying you immediately by email or other agent when, for example, a disk's SMART report begins to show problems.
We may have problems saving all your data in the current state, since you have multiple unreliable disks, and parity plus all other disks must be read reliably in order to rebuild any disk. The disk whose SMART report you labeled as having errors is actually FAILING NOW and must be replaced immediately. Unfortunately, you also have 2 other disks with pending sectors, so they can't really be trusted to accurately rebuild the failing disk. Those disks should also be replaced ASAP, but of course you can only rebuild one at a time, and the FAILING NOW disk takes priority. I guess we will have to start there and hope for the best; if you wind up with a corrupt rebuild, we can possibly repair the filesystem and save most things.
Why did you decide to do a parity sync anyway? That has probably corrupted parity somewhat, which of course also makes an accurate rebuild unlikely.
One last comment: you don't really have enough RAM for an upgrade to V6. I haven't checked the specs for the other hardware.
Let us know if you need more details about how to proceed with rebuilding the failing disk to a new disk.
Device Model: Hitachi HDS723020BLA642
Serial Number: MN1220F326RLAD
  5 Reallocated_Sector_Ct   0x0033 001 001 005 Pre-fail Always  FAILING_NOW 1975
196 Reallocated_Event_Count 0x0032 001 001 000 Old_age  Always  -           2392
197 Current_Pending_Sector  0x0022 100 100 000 Old_age  Always  -           31

Device Model: WDC WD20EARS-22MVWB0
Serial Number: WD-WCAZA3935061
  5 Reallocated_Sector_Ct   0x0033 200 200 140 Pre-fail Always  -           1
196 Reallocated_Event_Count 0x0032 199 199 000 Old_age  Always  -           1
197 Current_Pending_Sector  0x0032 200 200 000 Old_age  Always  -           17
198 Offline_Uncorrectable   0x0030 200 200 000 Old_age  Offline -           1

Device Model: WDC WD20EARS-00MVWB0
Serial Number: WD-WCAZA5742681
197 Current_Pending_Sector  0x0032 200 200 000 Old_age  Always  -           8
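The way these attribute rows are read can be sketched in a few lines of Python. This is only an illustration, assuming the usual `smartctl -A` column order (ID, name, flag, value, worst, threshold, type, updated, when-failed, raw value); the names `SmartAttr`, `parse_rows` and `assess` are hypothetical helpers, and the two rules applied (a FAILING_NOW marker, and a nonzero raw count on attribute 197) mirror the reasoning in the post rather than any official health check.

```python
import re
from typing import List, NamedTuple


class SmartAttr(NamedTuple):
    attr_id: int
    name: str
    value: int
    worst: int
    thresh: int
    when_failed: str
    raw: int


# Assumed column order of a `smartctl -A` attribute row:
# ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"\S+\s+\S+\s+(\S+)\s+(\d+)"
)


def parse_rows(text: str) -> List[SmartAttr]:
    """Extract attribute rows from report text, skipping other lines."""
    attrs = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            attrs.append(SmartAttr(int(m[1]), m[2], int(m[3]), int(m[4]),
                                   int(m[5]), m[6], int(m[7])))
    return attrs


def assess(attrs: List[SmartAttr]) -> List[str]:
    """Flag failed attributes and any pending sectors (attribute 197)."""
    findings = []
    for a in attrs:
        if a.when_failed == "FAILING_NOW":
            findings.append(f"{a.name}: FAILING_NOW "
                            f"(value {a.value}, threshold {a.thresh})")
        elif a.attr_id == 197 and a.raw > 0:
            findings.append(f"{a.name}: {a.raw} pending sectors")
    return findings


# Rows copied from the Hitachi report above.
HITACHI = """\
  5 Reallocated_Sector_Ct   0x0033 001 001 005 Pre-fail Always FAILING_NOW 1975
196 Reallocated_Event_Count 0x0032 001 001 000 Old_age  Always -           2392
197 Current_Pending_Sector  0x0022 100 100 000 Old_age  Always -           31
"""

hitachi_findings = assess(parse_rows(HITACHI))
for finding in hitachi_findings:
    print(finding)
```

The Hitachi is flagged because its normalized value for attribute 5 (001) has dropped below the vendor threshold (005); the two WD disks would be flagged only for their pending sector counts.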
trurl (posted May 19, 2018):
Another approach would be to create a new array with only the good disks, sync parity, then see if you can mount the bad disks outside the array (another thing that is much simpler in V6) and try to copy their contents. That has the advantage of getting the good disks protected, but it means you can't really rebuild any of the bad disks and will just have to hope you can read them well enough to get something off them.
Do you have any backups?
JorgeB (posted May 19, 2018):
You should run an extended test on those WD disks with pending sectors. They can sometimes show false positives, i.e., the disks may be fine for now. The Hitachi is definitely failing.
Imba (posted May 19, 2018):
Sigh, well, if I replace the failing disk, how would I go about it? Would it be the same steps to replace the other two drives? Also, I didn't actually start the parity sync; it did it on its own when I replaced the USB and started it back up.
JorgeB (posted May 19, 2018):
Parity isn't valid. The best way forward is to do as trurl suggested: update unRAID, then do a new config and copy the data from the failing disk(s).
trurl (posted May 19, 2018):
4 hours ago, Imba said: "I didn't actually start the parity sync it did it on its own when I replaced the usb and started it back up."
It must have seen super.dat that was copied without the array stopped and assumed unclean shutdown. Latest unRAID does a non-correcting parity check on unclean shutdown.
Can you add RAM? Maybe V6 with NAS capability only would work with 1GB, but it would be very tight.
Here is a link to the upgrading wiki: https://lime-technology.com/wiki/index.php/Upgrading_to_UnRAID_v6
JorgeB (posted May 19, 2018):
It was doing a parity sync, not a check, so something more serious happened, and with all the errors on disk2 it won't be valid anymore.
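Why a sync that reads a faulty disk leaves parity unusable can be shown with a toy model. This is a simplified sketch, assuming single parity computed as a byte-wise XOR across the data disks; the three tiny "disks" and the helper names `xor_parity` and `rebuild` are made up for illustration, and a real read error may be reported rather than silently returning wrong data.

```python
from functools import reduce
from typing import List


def xor_parity(disks: List[bytes]) -> bytes:
    """Byte-wise XOR across equal-length disk images."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*disks))


def rebuild(parity: bytes, surviving_disks: List[bytes]) -> bytes:
    """Rebuild one missing disk from parity plus every other disk."""
    return xor_parity([parity, *surviving_disks])


# Hypothetical three-disk array, three bytes per disk.
disk1 = bytes([0x10, 0x20, 0x30])
disk2 = bytes([0x0A, 0x0B, 0x0C])
disk3 = bytes([0xFF, 0x00, 0x55])

# With valid parity, any single disk can be rebuilt exactly.
good_parity = xor_parity([disk1, disk2, disk3])
rebuilt_ok = rebuild(good_parity, [disk1, disk3])

# A sync during which disk2 returns one wrong byte bakes the error into parity.
misread_disk2 = bytes([0x0A, 0x00, 0x0C])
bad_parity = xor_parity([disk1, misread_disk2, disk3])

# Rebuilding a different disk from that parity now yields corrupt data.
rebuilt_bad = rebuild(bad_parity, [disk2, disk3])

print("rebuild with good parity matches disk2:", rebuilt_ok == disk2)
print("rebuild with bad parity matches disk1:", rebuilt_bad == disk1)
```

The same model shows why every other disk must read reliably during a rebuild: each byte of the rebuilt disk depends on the corresponding byte of parity and of all remaining disks.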
pwm (posted May 19, 2018):
Older versions of unRAID were more inclined to do a corrective parity sync. Didn't all versions before version 6 default to having the 'correcting' checkbox set even if someone wanted to manually start a parity scan?
JorgeB (posted May 20, 2018):
18 hours ago, pwm said: "Older versions of unRAID was more interested in doing corrective parity sync."
I believe you mean check; a sync always writes.
18 hours ago, pwm said: "Didn't all versions before version 6 default to have the 'correcting' checkbox set even if someone wanted to manually start a parity scan?"
It still does. You need to uncheck the "write corrections to parity" box before starting a non-correcting manual check, though on newer releases it does default to non-correcting after an unclean shutdown.
pwm (posted May 20, 2018):
unRAID really should stay away from writing corrections unless the user more or less forces that operation. In case there is something wrong, the user should be given the full set of options of what steps to try to recover, which means the most recent parity must be left intact.
Imba (posted May 20, 2018):
OK, so I'm confused as to what steps I should be taking. I can replace the failing hard drive and more than likely add more RAM, but I don't understand how to go about all this. Upgrade before anything? Replace the drive first? How do I get info from the failing drives? I'm sorry, UnRAID seems to be beyond the limits of my usual comprehension.
trurl (posted May 21, 2018):
On 5/18/2018 at 11:07 PM, trurl said: "Do you have any backups?"
This is probably the first thing to consider. If you have any important and irreplaceable files that you don't have backed up, then try to copy them from the server to your PC.
Imba (posted May 29, 2018):
I don't have many overly important files on the server, but there are some things that I would like to save, of course. So should I just start the server again, stop the sync, and try to pull from the failing drive? Or do I have to pull files in general (e.g. from shares as opposed to the drive that is failing)?
JorgeB (posted May 29, 2018):
Did you run an extended test on the drives with pending sectors to confirm whether they are failing or not?
On 5/19/2018 at 7:41 AM, johnnie.black said: "You should run an extended test on those WD disks with pending sectors, they can sometimes show false positives, i.e., the disks may be fine for now, the Hitachi is definitely failing."
JorgeB (posted May 29, 2018):
On the main page, click on the disk, scroll down to the Self Test section, then click start on "SMART extended self-test".
Imba (posted June 7, 2018):
Sigh, well, it looks like I don't have that option.
trurl (posted June 7, 2018):
52 minutes ago, Imba said: "Sigh, well it looks like I don't have that option."
I think you would have to do that from the command line in V5. Or if you have unMenu, maybe it would have something for this. I found this by searching the wiki: https://lime-technology.com/wiki/Console_commands_for_hard_drives
Imba (posted June 7, 2018):
I guess I'll have to use the command line. Should I run the short or long test?
pwm (posted June 7, 2018):
Long test. Only that one will scan the surface.
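For reference, the long test can be started and its result read back with `smartctl`. The sketch below only assembles the command lines and prints them by default; `/dev/sdX` is a placeholder for the real device, the helper names are hypothetical, and some older controller and driver combinations may need an extra device-type option that is not shown here. The printed commands can equally be typed directly at the server console.

```python
import subprocess
from typing import List


def long_test_command(device: str) -> List[str]:
    """Command that starts an extended (long) SMART self-test."""
    return ["smartctl", "-t", "long", device]


def selftest_log_command(device: str) -> List[str]:
    """Command that prints the self-test log once the test has finished."""
    return ["smartctl", "-l", "selftest", device]


def run(cmd: List[str], dry_run: bool = True) -> str:
    """Return the command as text, or execute it when dry_run is False."""
    if dry_run:
        return " ".join(cmd)
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout


device = "/dev/sdX"  # placeholder: substitute the disk under test
print(run(long_test_command(device)))
print(run(selftest_log_command(device)))
```

The test runs inside the drive and can take several hours on a 2TB disk, so the log is only meaningful after the drive reports the test as completed.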
This topic is now archived and is closed to further replies.