December 18, 2014

Hi,

Version: 5.0.6
Syslog file attached.

I've been using Unraid for a number of years without any hitches. All of a sudden, overnight yesterday, something appears to have happened to my cache drive. When I access the Unraid admin screen, the cache drive shows a green ball next to it, but the Free column says 'unformatted'.

No data is stored on the cache drive, so I'm not concerned about data loss, but I have my apps set up on there:

- Plex Media Server - PhAzE plugin
- CouchPotato - Unplugged PLG
- SABnzbd - Unplugged PLG
- SickBeard - Unplugged PLG

I am unable to access these apps (apart from Plex, but none of my library is there). All data on my other drives appears fine. When I go into MC via Telnet I can see the cache drive under /mnt/cache.

I have tried stopping the array, unassigning the cache drive and restarting, then stopping, reassigning the drive and restarting again, but this hasn't changed anything. When I start the array in maintenance mode the cache drive still appears green. I have done a clean powerdown and reboot, and this doesn't change anything either.

As I have no data on the cache drive, I'm happy to format it if that's the fix, but ideally I would like to keep all the settings I had for the apps rather than set them all up again.

Any advice is welcome.

Attachment: syslog.zip
December 18, 2014

I would consider removing that cache drive and trying to get your settings off it another way. In the meantime, replace the cache with another drive and see what it does. That way nothing may be lost.
December 19, 2014

In the syslog, your Cache drive does not look good at all, and it certainly should not be showing a green ball. Time to examine a SMART report for it. It does appear to be communicating, so I think it will give you a SMART report.

The errors are serious, but a little confusing (the drive is in IDE emulation mode), indicating both bad sectors and seek issues. Seek issues are fatal, and the syslog shows that it thinks the drive has "possibly failed". Bad sectors at the very beginning of the drive are also quite serious, because you cannot have them at the beginning of the file system area. You can work around them elsewhere, but not at the beginning. The bad sectors may, however, be a consequence of a failure to seek: if the drive can't seek, it can't read either, and the drive is junk.

The SMART report should tell us more (please attach a copy for us). If the drive is still usable, the next step will probably be to Preclear it, and the results of that should help you decide what to do with it.

One recommendation: I noticed that you have IDE emulation turned on for some of your SATA drives, in particular the Cache drive. When you next boot, go into the BIOS settings, look for the SATA mode, and change it to a native SATA mode, preferably AHCI, anything but IDE emulation mode. AHCI mode is a little safer, a little faster, and it reports errors in a way I'm more familiar with.
December 20, 2014

No seek issues, so the drive IS still usable. It's got some age on it (55000 hours), 35 reallocated sectors (you can live with that, so long as the count does not continue to grow), one Pending sector (which has to be fixed), and a bad sector (which is probably the Pending sector).

The SMART report confirms the bad sector at LBA 192, which is apparently right in the heart of the Reiser file system. It prevents the system from reading the superblock, which means the drive cannot be mounted.

Un-assign the drive and Preclear it. That *should* take care of the bad sector(s) and prepare the drive to be put back online, if it completes and tests OK.
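For anyone following along, the two counters discussed above (reallocated and pending sectors) can be pulled out of a SMART attribute table with a small helper. This is only a sketch under assumptions: it assumes smartmontools is available and that `smartctl -A` prints its usual ten-column attribute table with the raw value in the last column. The function name `smart_raw` is made up for illustration, and `/dev/sdX` is a placeholder.

```shell
# smart_raw NAME
# Reads `smartctl -A` output on stdin and prints the raw value (column 10)
# of the attribute whose name (column 2) matches NAME exactly.
# `smart_raw` is a hypothetical helper name, not part of smartmontools.
smart_raw() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# Example usage (replace /dev/sdX with the real device):
#   smartctl -A /dev/sdX | smart_raw Current_Pending_Sector
#   smartctl -A /dev/sdX | smart_raw Reallocated_Sector_Ct
```

Running it before and after a Preclear is one way to see whether the pending sector count has actually dropped to zero.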
December 20, 2014 (Author)

Thanks for the reply Rob. I'm running a preclear on it now using Screen. Will confirm when it completes.
December 21, 2014 (Author)

Hi Rob,

The preclear has now finished. Looking at the report at the end, I have a feeling it wasn't successful (attached for reference). Should I try to mount this drive back in the array, or do I need to replace it?

Attachment: preclear_report.txt
December 21, 2014

Did you set the drive to AHCI?
December 21, 2014 (Author)

@Helmonder - No, I didn't. I didn't realise that was necessary before the preclear. I was planning on doing it afterwards, as my machine is headless and I need to set up a monitor to get into the BIOS. Do I need to run the preclear again after changing to AHCI?
December 21, 2014 (Author)

OK, so I got a monitor set up and changed the SATA configuration to AHCI, but then it won't boot into Unraid. I changed the SATA configuration back to IDE and Unraid loads fine.
December 21, 2014

> Do I need to run the preclear again after changing to AHCI?

No. It will run OK in the current mode; it will run a little better in AHCI mode.

The SMART numbers are essentially unchanged. In particular, the current pending sector is still there. It's as if the zeroing did not occur. Can you grab the syslog that covers that Preclear and post it here? There should be errors reported in it. You don't want to put the drive back in service until there are no pending sectors.

> Ok, so I got a monitor setup and changed the sata configuration to AHCI but it then won't boot to unraid. I changed it back to IDE in the sata configuration and unraid loads fine

Just saw this. What errors are you seeing? It should work the same.
December 21, 2014 (Author)

@Rob Syslog attached. When I switched to AHCI and it tried to boot, it just remained on a black terminal screen and never loaded anything. The cursor was still blinking, so it hadn't crashed. I left it for 20 minutes or so but nothing happened. I then turned the machine off and switched back to IDE, and it loaded to the Unraid boot screen first time.

Attachment: syslog-21-12-14.zip
December 21, 2014

Is the boot from flash failing, or is it not trying to boot from flash at all? Maybe you just need to check the BIOS again and choose the boot device.
December 21, 2014 (Author)

@trurl - I will double check, but the only setting I changed was the SATA configuration, from IDE to AHCI. I changed nothing about the boot device. As soon as I changed it back to IDE, it booted fine without any other changes.
December 21, 2014

On some machines, after making any change to the drive setup, the BIOS will try to 'help' you by resetting the boot order. Do make sure your flash drive is still set to be the first to boot. However, you should have seen some sort of boot error, so that may not be the problem. If you can't get AHCI to work, then you'll have to live with the IDE emulation.

Unfortunately this syslog did not have the Preclear in it (it covers the period of 2:38pm to 2:56pm).
December 21, 2014 (Author)

OK, I managed to get AHCI working now. As @RobJ said, my BIOS was trying to be helpful and resetting the boot device.

About the syslog: I didn't realise it resets on reboot. I assume there's no way of getting the logs from when I ran the preclear over the last 24 hours or so? All I have is the report text file which I posted earlier. I copied that information from Telnet when the preclear had completed.

For the preclear I followed the instructions in the configuration tutorial and cleared the drive using Screen and the following command:

    ./preclear_disk.sh -A /dev/hdd

I've precleared drives before doing exactly that, so I'm pretty sure I did it right.

What would you recommend next? I can run the preclear again on the drive, but as it will take over 24 hours, I'd rather go down other routes if they are available now that I've switched to AHCI.
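Since the syslog lives in RAM and is lost on reboot, one way to keep it is to copy it to the flash drive before shutting down. The following is a minimal sketch, not an official Unraid tool: the function name `save_syslog` and the default destination `/boot/logs` are assumptions chosen for illustration (the flash drive is mounted at /boot, but the `logs` subfolder is just a convention picked here).

```shell
# save_syslog [SOURCE] [DEST_DIR]
# Copies the syslog to a timestamped file and prints the new file's path.
# Defaults are assumptions: /var/log/syslog as source, /boot/logs as target.
save_syslog() {
    src=${1:-/var/log/syslog}
    dest_dir=${2:-/boot/logs}
    mkdir -p "$dest_dir" || return 1
    out="$dest_dir/syslog-$(date +%Y%m%d-%H%M%S).txt"
    cp "$src" "$out" && echo "$out"
}

# Example usage, run before any reboot or powerdown:
#   save_syslog
```

Running it right after a long job such as a Preclear finishes means the relevant errors are still in the log when it is saved.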
December 21, 2014

That's great (the AHCI and booting part)!

In AHCI mode, the drive will be treated as a true SATA drive, so the drive symbol will be new. Use this to see it:

    preclear_disk.sh -l

Then I would recommend a badblocks pass. It will take a VERY long time, but in some ways it is more thorough. Try the following, changing sdX to the new drive symbol (it's not hdd any more):

    badblocks -b 4096 -wsv /dev/sdX

Make VERY sure you use the right symbol, because this badblocks command will totally wipe out any data on that drive!
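Because the write-mode badblocks run is destructive, a small guard can reduce the chance of pointing it at a drive that is in use. This is a sketch only: `is_mounted` and `safe_badblocks` are hypothetical helper names, and the check is deliberately conservative (a simple prefix match against the mounts table, so it may refuse in cases where a stricter check would allow the run). It does not replace double-checking the drive symbol by hand.

```shell
# is_mounted DEVICE [MOUNTS_FILE]
# Succeeds (exit 0) if DEVICE or anything whose name starts with DEVICE
# (for example its partitions) is listed in the mounts table.
is_mounted() {
    dev=$1
    mounts=${2:-/proc/mounts}
    awk -v d="$dev" 'index($1, d) == 1 { found = 1 } END { exit !found }' "$mounts"
}

# safe_badblocks DEVICE
# Runs the destructive write test from the post above, but only if
# nothing on DEVICE appears to be mounted.
safe_badblocks() {
    dev=$1
    if is_mounted "$dev"; then
        echo "Refusing to run: $dev (or a partition on it) is mounted" >&2
        return 1
    fi
    badblocks -b 4096 -wsv "$dev"
}

# Example usage (replace /dev/sdX with the real drive symbol):
#   safe_badblocks /dev/sdX
```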
December 21, 2014 (Author)

Thanks RobJ, will do. A couple of questions:

1 - Will I be able to use the array while badblocks is running?
2 - Should I run this in Screen so I can use the array?
3 - When you say it takes a very long time, what are we talking? The preclear took approx 25 hours.
December 21, 2014

> 1 - Will I be able to use the array when the badblocks is running?

Yes.

> 2- Should I run this in Screen so I can use the array?

Good idea, either with Screen or on the physical console.

> 3 - When you say it takes a very long time what are we talking? The preclear took approx 25 hours

I don't actually know. I suspect it's going to take MUCH longer than the Preclear, if it does the full 4 passes.
December 21, 2014 (Author)

@RobJ - I am running badblocks as discussed, but I found the attached preclear report files on the flash drive. I didn't realise the preclear created these there. I'm hoping they might help you figure out why it said the preclear was unsuccessful.

Attachments:
- preclear_finish__WD-WCASJ1297981_2014-12-21.txt
- preclear_rpt__WD-WCASJ1297981_2014-12-21.txt
- preclear_start__WD-WCASJ1297981_2014-12-21.txt
December 22, 2014

The preclear was not successful because the post-read found non-zero bytes on the drive. It should have been all zeroes after the zeroing, which is another clue that the zeroing was not successful. The zeroing should also have caused the drive to deal with the bad sectors, and it didn't.

The before and after preclear reports are surprisingly similar, except for one significant detail: the ATA error count. It's normally zero on most drives, but it had risen to 32 before the preclear began, and the last 5 of those errors pointed at the one known bad sector at 192. The final SMART report, however, shows that the error count has risen to 5512! That's an incredible increase and very alarming. The last 5 shown are all UNCs (UNCorrectable sector) at sector zero, which is also somewhat alarming. This drive is looking less and less recoverable.

I'm curious as to how the badblocks pass is progressing. Lots of errors or unexplained delays?
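The error-count comparison described above can be repeated by hand. As a rough sketch, assuming smartmontools is installed and that the drive's error log is printed with the usual "ATA Error Count: N" line, the helper below extracts just that number; `ata_error_count` is a made-up name for illustration. If the drive has logged no errors, smartctl does not print that line and the helper prints nothing.

```shell
# ata_error_count
# Reads `smartctl -l error` output on stdin and prints the number from the
# "ATA Error Count: N" line, if present. Hypothetical helper name.
ata_error_count() {
    sed -n 's/^ATA Error Count: \([0-9][0-9]*\).*/\1/p'
}

# Example usage (replace /dev/sdX with the real device):
#   smartctl -l error /dev/sdX | ata_error_count
```

Noting the value before and after a long test makes a jump like the one from 32 to 5512 easy to spot.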
December 22, 2014 (Author)

@RobJ - thanks for looking at that. The badblocks run was going fine: the first 2 passes completed with no errors (unless it only reports them at the end), and the 3rd pass was nearly complete, but I stupidly tried to copy the text from the PuTTY client and it stopped the process! :-(

I'm rerunning it now and will let you know how it goes.
January 20, 2015 (Author)

@RobJ - long delay over Christmas, but I finally got time to run badblocks all the way through. It has completed all the passes and found 0 bad blocks.

I tried adding the drive back to the array as the cache drive and it still showed as unformatted, but with a green ball. I have now run the format on the drive and it appears to be OK. I will let you know if I get any more issues. Thanks for all your help.