fido666

Members
  • Posts

    150

Everything posted by fido666

  1. I am in the process of migrating an existing array created under 4.7 Plus to 5.0.5 after experiencing data drive failure on the old setup. The array had been reset to a new config that didn't include any of the previously failed disks, and the original parity drive was re-assigned as the parity drive on the new config. I was at the point where a parity sync was running with an estimated completion time of around 4 hours (drives are all 2TB). It had been running for less than 30 minutes when I heard a drive spin down and noticed the sync had stopped and the parity drive was now red balled. I have included a syslog and smartctl reports for all the connected drives but it looks like the parity drive has died, as it could not be read. While the parity sync was running I was adjusting the NFS shares; not sure if this has interfered with the sync in some way. Syslog_21062014_.txt smart_reports.zip
  2. Thanks Jonathanm. So if I leave the parity slot empty for now then that means I can assign the original parity drive or a larger replacement later, right? Without wiping them unless they are unformatted (as in not ReiserFS formatted), right? Edit: I have gone with frank1940's earlier suggestion to complete the migration to 5.0.5 prior to replacing the parity drive with a precleared replacement. My existing array is now in the process of completing a parity-sync. I have also checked that the new blank drive is visible to the preclear script but haven't fired it off yet; I will wait till the parity sync has completed and I'm satisfied the server is back on the air. Thanks everyone for your suggestions, they have been invaluable in helping to restore my server.
  3. Thanks Frank, yes I have all the serials noted down. I was going to replace the parity drive with a new disk as part of the upgrade but would it be best just to get 5.0.5 going with the old drives (remaining good data drives and the parity) first and then worry about swapping it out? An earlier suggestion in another thread (http://lime-technology.com/forum/index.php?topic=33771.0) was to create the array without assigning parity and then substitute the new disk before assigning it as the parity drive. Edit: Doh! I feel like such a goose. I have solved the issue with the console not starting on the old 4.7 config. Somewhere along the line while prepping for the upgrade to 5.0.5 I had issued a "mv /boot/config/go /boot/config/go.txt" command as per the guide linked in the 1st post of this thread. Therefore I had no go file and UnMenu wouldn't start. The 4.7 console now starts and shows the 2 missing disks so all solved. The question I posted above still remains however. Edit2: OK, I have followed the upgrade procedure in the Release Notes under "If you are currently running:" then "Version 4.7". I have successfully booted into the 5.0.5 console and webGUI but it is showing the old config with the 2 missing disks, and the array has not started ("too many missing disks" message). All disks are listed in the correct positions. There is no cache assigned but it is visible via the drop down menu. Do I simply un-assign the 2 missing disks (can't see any obvious way to do this) or should I use the New Config icon on the Utils tab to reset the configuration? What about the Parity drive? Remember I want to replace that one with a larger drive since my parity needs to be rebuilt from scratch anyway.
  4. Do I need to do the migration to 5.0.x with the new config process prior to attempting to preclear the new disks? I tried to substitute one of the new HDDs into my old 4.7 setup knowing the array would not be able to start but the Preclear script is also reporting "No un-assigned disks detected". Is this because I'm on Plus and already had 7 disks in the old config? I know the new drive is being detected by the BIOS OK. I can't even get the UnMenu GUI to come up on my old 4.7 setup now (can't connect using either the host name or IP address) but can log into the server from the console just fine.
  5. It was the DOA aspect I was concerned with. I guess just popping them in to the Windows box and making sure they register is a good enough quick test prior to running a Preclear on the UnRAID box. Thanks, I was aware of that from my initial setup using the 2TB drives. The 2TB drives took 24-27hrs each from memory so 100hrs for 2 cycles on a 4TB sounds about right. Don't think I'll have time to do multiple runs, I'm under pressure to get the box back on air again.
  6. Update: I was able to successfully recover all the files (at least I think it was all of them) from one of the drives using DiskInternals Linux Reader, and have saved those files to an external drive for now. The other drive was toast, looks like the motor may have been on the way out. Both drives have now been sent in for RMA. I am going to update to 5.0.5 using the remaining good drives plus 2 new larger replacements for the failed drives. Will update here on my success or otherwise with that process.
  7. I have picked up a couple of 4TB drives to replace my failed 2TB drives. As previously posted I am going to upgrade to 5.0.5 at the same time by doing a new config that doesn't include the failed drives. As the disks are brand new I'm guessing a Preclear will still be required? As the Preclear will take a very long time would it be advisable to do at least a single pass surface scan on them via my Windows PC first?
  8. I gave up on the other drive as it just kept causing Linux Viewer to hang, and when it was pulled from the PC it was also very hot (not normal for a Coolspin). Both drives have now been sent back for RMA via the retailer. Since that's going to take up to 2 weeks I have elected to purchase a couple of Hitachi Deskstar NAS 4TB drives in the interim, fingers crossed my HP Microserver plays nice with them.
  9. I've recovered all the files off one of the failing drives as far as I can tell, thank you Linux Viewer! The other drive reported imminent failure via SMART on booting up the PC after slotting that drive in. I can't recall how long the last drive took to read the file tree via Linux Viewer but this one has been going for over an hour and the status bar hasn't moved. How long did your drives take, Drawde?
  10. That's what I thought, I just had it in my head that the disks might get re-formatted when re-starting the reconfigured array. Yes, someone pointed that out to me in another thread. I've screen dumped all the config pages prior to shutting down my old array. Just UnMenu and the Preclear script from memory, everything else as standard. Have done a file level copy of the USB, assume this is sufficient to capture the license and config files. Many thanks for the tip. Shall do!
  11. Dumb question time. I am planning to migrate to 5.0.5 after having 2 drives fail from my existing 4.7 array. I'm in the process of recovering the files off the 2 drives via my Windows PC and Linux Reader. The plan is to set up a new config using just the remaining good parity, cache, and data drives plus a spare drive I have available (needs a preclear first). I know parity will need to be rebuilt from scratch but will the data from my existing good data drives still be visible when added to the new array?
  12. I had come to the same conclusion, especially with one of the drives at risk of imminent failure.
  13. Sorry Drawde, that wasn't my intention when I first posted in one of your threads but you know how things snowball, right? I'm using that same viewer (typo corrected in previous post), slow but it's working. I'm copying the files to a 4TB USB external drive; all I did was set up a directory on that drive and tell Linux Reader to preserve the existing directory structure when copying the files. I'm going to leave it to do its thing now and hit the sack, have to be up for work in under 7 hours.
  14. That might be a good idea. All I want to do for now is take the suspect drives offline and see if I can view any of the contents on my Windows PC using a Linux viewer. Edit: Initial signs looking at the 1st drive with DiskInternals Linux Reader are good, I can see the directory structure and files within the directories. This was the disk that was undergoing the rebuild however so I don't know how healthy those files are yet. Copying them across to an external HDD anyway, fingers crossed! I think this drive may physically be OK and the earlier problem that started all this fun was connection related, there's nothing untoward in the SMART registers anyway.
  15. OK, good to know. I've just been looking at this article in the Wiki for guidance on the upgrade process :- http://lime-technology.com/wiki/index.php/Migrating_from_unRAID_4.7_to_unRAID_5.0 Seems a bit tricky! Was your experience relatively painless? Oh and how did you back up your USB? Did you make an image or just do a straight file copy?
  16. So if I have understood correctly it is only necessary to know which drives were the good data, parity, and cache drives by serial number but the port assignments are OK to change. Obviously since I'll be removing 2 drives from the array the port assignments may change.
  17. OK thanks itimpi, I will give that a shot so wish me luck. I'm going to cancel the current rebuild after getting a screen dump of the existing drive assignments. Do I need to un-assign the suspect disks from the array after stopping the array or will the upgrade to v5 negate this step?
  18. I had a feeling that was the case, it makes sense, was just hoping for a miracle. So it's the "initconfig" procedure I need to follow? http://lime-technology.com/wiki/index.php/Un-Official_UnRAID_Manual#Remove_one_or_more_data_disks And can I just cancel the current rebuild before attempting it? Is that safe to do on a compromised existing array? I realise this but then if I let the failing disk die I will lose the data anyway so at least your way I have a chance. My apologies to the OP for jumping in his thread but we seem to be in the same situation so I hope you don't mind us sharing.
  19. So would it be best to remove the drive that is throwing the multiple read errors first given it is the most likely of the two to fail completely? I'm assuming it's the "Replace a failed disk" procedure that needs to be followed? That still means stopping the rebuild that's already running on the other drive though so will that cause issues?
  20. I wonder if this is what I should be doing to try and fix the problem I'm having with 2 drives in my array (refer my existing thread here :- http://lime-technology.com/forum/index.php?topic=33745.0)? I don't think the rebuild I'm running is going to work as both drives have come up as unformatted and one of the drives looks like it's about to die. So is it possible to remove both drives from the array at the same time? I realise that the array will lose access to whatever data was on the drives but with the Linux viewer I might be able to recover some or all of it.
  21. I really think I need to kill this rebuild now. The drive being rebuilt appears to be OK but the 2nd unformatted drive is just drowning in read errors now. I'm a bit confused since the drive showing all the read errors is not the one currently marked as being rebuilt. Is it because the drive is being read as part of the rebuild process on the other drive? If I kill the current rebuild and replace only the drive throwing read errors will the rebuild process on the other drive kick off again upon a restart of the array? I realise I need to preclear any replacement drive first, all of which is going to take some time. Have attached another SMART report on the failing drive, I don't think it's long for this Earth. J3DSmartDataV216062014.txt
  22. This deadline was only because the drives are both just out of warranty but the seller has agreed to submit them for RMA for me. They said they need them brought in fairly soon so as to make their best case to the distributor. I have 1 drive I can use here (not part of the current array) but it requires preclear/formatting. Should I stop the current rebuild and swap the one throwing all the read errors out? The data isn't precious family memories and could be retrieved from other sources if needed but would be messy and yes time consuming. OK thanks for confirming, think I should concentrate on restoring the current setup first. It's pretty plain from the logs that it's having issues. I have attached a zipped copy for you, it's not pretty. It's just an extract as the full dump was too large to attach even zipped. Console is showing over 232,000 errors on that drive now, logs list lots of "handle stripe read errors". Have also attached the current SMART status for this drive throwing read errors (file labelled J3DSmartData), it's reporting imminent failure. Have also attached the current SMART status for the drive listed as currently rebuilding (file labelled 71DSmartData), it seems to be OK. I did notice that the Spin Retry Count has reset to 0 for this drive, yesterday it was reporting 4 (possibly power related perhaps?). syslog-extract-2014-06-16.zip J3DSmartData16062014.txt 71DSmartData16062014.txt
  23. I think Disk4 is rather unhealthy, it is now showing in excess of 15,000 errors in the main status window. The server houses our media collection but I can't be sure which parts of it are on each individual drive as I haven't forced the shares to use particular drives. I can't see the 2 problem drives as individual disks under Windows currently but I can see the overall share structure. While it wouldn't be the end of the world losing the data on those 2 disks a lot of time and effort has been spent on creating their contents, would rather not have to break the bad news to my other half. Can reiserfsck be run on a drive while the others are online? Have never used it before so don't know how it works.
  24. Update: After consulting the Wiki I decided to proceed with a restart of the array. The previously undetected disk is currently in the process of being rebuilt however the other disk that was coming up as unformatted has done that again (I guess the array has to be started for the format status to show up). Should I stop the rebuild process or just let it run and then deal with the unformatted disk?