June 22, 2010
I performed a parity check after moving a bunch of movies to and from the server, and it told me it had 256 errors. When I look at the unRAID menu, the only drive that shows errors in the error column is the parity drive, which shows 85 errors. Can someone take a look at the attached .txt file and tell me if I'm in trouble? This is the first time I've had errors, and I've only had this running for a few months. Note that I perform all folder transfers through Windows via network shares. By the way, it shows the parity as being Valid.

Attachment: syslog-2010-06-21.txt
June 22, 2010
There are definitely errors on your parity drive. Check all cables on that drive.

I would suggest you grab a SMART log:

    smartctl -a /dev/sd?

where ? = your parity drive. Example:

    echo status > /proc/mdcmd ; strings < /proc/mdcmd | grep rdevName.0
    rdevName.0=sdk
    smartctl -a /dev/sdk > /boot/smartctl.sdk.beforetest

Upload that log here. You can also try running the SMART self-tests:

    smartctl -t short /dev/sdk

Wait 5 minutes.

    smartctl -t long /dev/sdk

Wait about 2-5 hours, depending on the estimate the command gives back. Do not use the array while this is going on. Afterwards, run

    smartctl -a /dev/sdk > /boot/smartctl.sdk.aftertest

and compare the logs or upload them.
June 22, 2010
Before you run the smartctl commands, be sure to disable any and all spin-down timeouts; in unRAID, set them to 'NEVER'.
June 22, 2010 (Author)
Thanks, both of you. I will start this when I get home from work. I won't be transferring any more files until I get this figured out. This wouldn't have anything to do with the fact that I don't have the jumper on the back of my EARS drives, would it? I plan on ordering another drive ASAP as a backup just in case (probably not a WD Green).
June 22, 2010
"This wouldn't have anything to do with the fact I don't have the jumper on the back of my EARS drives would it? I plan on ordering another drive asap as a backup just in case (probably not a WD green)."

If your parity drive does not have the jumper, then it should just be slow, but not have errors like this. Run the SMART commands to get a frame of reference.

If you wanted to, you could take the parity drive out of the array and do a preclear on it. I would really suggest doing the preclear on another machine so you do not make a mistake. If you do decide to preclear, you can set the jumper (or just wait for your new drive and do it all right from the get-go).

My own 2TB WD drive was brand new, and after the preclear it had reallocated sectors (but no pending sectors).
June 22, 2010 (Author)
OK, just so we are on the same page: I'm a novice at best when it comes to the command line, so feel free to talk down to me and dumb it down if necessary. Since my parity drive is "a", would all the places you put "k" in sdk become "sda", or is sdk a legitimate command? I will be using PuTTY to perform all this.

So, first perform:

    smartctl -a /dev/sda

or

    echo status > /proc/mdcmd ; strings < /proc/mdcmd | grep rdevName.0
    rdevName.0=sda
    smartctl -a /dev/sda > /boot/smartctl.sda.beforetest

I really am sorry in advance if this becomes frustrating to explain because it is over my head. I will be searching the wiki to help teach myself some of this, but I was honestly not expecting trouble this early on... computers...

Thanks for the help!!
June 22, 2010
"Since my parity drive is "a" would all the areas you put "k" in sdk be "sda""

That is correct.

The ; character in the other message chains two commands on one line. To simplify, just enter them as two separate commands:

    echo status > /proc/mdcmd
    strings < /proc/mdcmd | grep rdevName.0

That should return

    rdevName.0=sda

If it does, then you know parity is /dev/sda. Then perform

    smartctl -a /dev/sda > /boot/smartctl.sda.beforetest

This saves your SMART log to your flash drive before any testing. To view it, run

    more /boot/smartctl.sda.beforetest
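The lookup above can be wrapped in a small helper. This is only a sketch: the function name `parity_device` is made up for illustration, and it assumes the status text contains a line of the form `rdevName.0=sda`, as shown in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read mdcmd status text on standard input and
# print the /dev path of the parity slot (the rdevName.0 entry).
parity_device() {
    # -n suppresses normal output; the trailing p prints only the
    # line where the substitution matched. The "=" right after the 0
    # keeps entries such as rdevName.10 from matching.
    sed -n 's|^rdevName\.0=\(..*\)$|/dev/\1|p' | head -n 1
}
```

On the server it would be fed the same way as the grep above: `echo status > /proc/mdcmd` followed by `strings < /proc/mdcmd | parity_device`.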
June 22, 2010 (Author)
OK, I ran the SMART log and attached it. Thanks, WeeboTech, your instructions were very easy to follow. I didn't know where to find the log on the flash drive, so I just pasted it into Notepad.

Edit: found the log on the flash drive, so I will post the aftertest log when it is complete.

Attachment: smart_log.txt
June 22, 2010
I took a peek and saw 2 pending sectors. That means 2 sectors are unreadable, and the next write to those sectors will remap them somewhere else.

On my brand new drive, I did a short test and a long test, then Joe L.'s preclear script. Sure enough, there were remapped sectors.

I do suggest you schedule some time and stop the array. Either move the drive to another machine and do a preclear there, or "carefully" do a preclear on this drive in place, being extra sure you do not accidentally pick the wrong drive, or your data will be destroyed.

If there are bad sectors, they will be remapped. Then do a long test again. If it looks good, reassign the parity drive and rebuild it.

The drive does not look bad at this point; it's just not perfect. Many others are not either.
June 23, 2010 (Author)
I understand. I captured the aftertest log once the time the long test said it would take had passed, only to see in the log that 20% of the test was still remaining... hope I didn't cause a hiccup.

I have another drive on the way. It's a 7200rpm drive, which I plan to use as a replacement for the current parity drive. So in either case I will have to perform a "careful" preclear on it.

I will post the final aftertest log in the morning just to verify the findings. If it's what you say, at least I know what to look for and will be glad it's not too serious. No matter what, I will be doing some serious research to avoid permanently deleting any data on a drive when performing the preclear (this is the only desktop unit).

Thanks again for the help, WeeboTech.

Attachment: smartctl.sda.aftertest.20%remaining.txt
June 23, 2010
Many of the drives say to wait 255 minutes for the long test to complete. They lie, as it often takes quite a bit longer, especially with the 2TB drives. Figure on 4 or 5 hours. I think they only allocated a single byte for the minutes in their data structure, and it is not enough to hold the true time needed.

As far as the preclear_disk.sh script goes, it will not let you run it on a device assigned to your array, so you are relatively safe. It will not let you run it on any disk that is mounted, even if not part of the array. Lastly, it will ask you to confirm you have the correct drive before doing anything to it, so it is very hard to make a mistake.
June 23, 2010 (Author)
Well, that eases my mind some with regard to the preclear. The aftertest log looks fine to me, but to be honest I don't know what most of it means. I couldn't get Notepad2 to export it the way it looks in Notepad2, so I created a .doc file.

Attachment: smartctl.sda.aftertest.doc
June 23, 2010
It shows you have 2 sectors pending re-allocation:

    Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       2

They will be re-allocated when those sectors on the disk are written to.

I'd run another preclear cycle on it. For those sectors to still be pending, they would have to have been unreadable in the post-read phase.

Joe L.
June 23, 2010
Also, if you were to re-generate parity, those sectors would be re-written, thus causing the remapping to occur. I would still do the preclear.
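The pending-sector count quoted above can be pulled out of a saved report with a one-line awk filter. This is a rough sketch, not part of any unRAID tool: `smart_raw` is a made-up name, and it assumes the full smartctl attribute table, where each row starts with the attribute ID so that the raw value is the tenth column.

```shell
#!/bin/sh
# Hypothetical helper: print the raw value of one SMART attribute
# from a report saved with "smartctl -a /dev/sdX > file".
#   $1 = attribute name, e.g. Current_Pending_Sector
#   $2 = path to the saved report
smart_raw() {
    # Column 2 is the attribute name and column 10 the raw value,
    # assuming the usual ID/NAME/FLAG/VALUE/WORST/THRESH/TYPE/
    # UPDATED/WHEN_FAILED/RAW_VALUE layout.
    awk -v name="$1" '$2 == name { print $10; exit }' "$2"
}
```

For example, `smart_raw Current_Pending_Sector /boot/smartctl.sda.aftertest` would print just the count of sectors pending re-allocation.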
June 24, 2010 (Author)
Sounds good. I get my new drive today and will be performing the preclear on it, then rebuilding the parity. After that I will run the preclear on this drive.
July 1, 2010 (Author)
So I got my new drive in and performed the preclear. It gave me a message, and I wanted to see if others have gotten it before:

"Disk has been successfully precleared (took 25:13:30)
S.M.A.R.T. error count differences detected after pre-clear.
Note some 'raw' values may change, but not be an indication of a problem."

Since it says "not be an indication of a problem", I'm assuming it's OK. But can someone explain what happened, just so I understand? I plan on swapping it in for my current parity drive that has errors.

Just a note: I had to do another parity check due to a power outage, and it reported more than double the errors it had before.
July 1, 2010
The "raw" values have meaning only to the manufacturers, if they even show them. They vary from drive model to drive model, even within the same brand, and many of them change each time you run a SMART report.

The "diff" output shows the lines where the values changed, nothing more and nothing less. In fact, if a parameter is unchanged AND is failing, it will not show in the output, since "diff" only shows lines that are different. Granted, a true failure will probably show itself as problems in clearing the drive, so don't worry about that too much.

All we can do is look at those parameters that appear to be human-readable in the SMART report (and the "diff" between the starting SMART report and the ending one).

For example, one parameter that is usually readable is the disk temperature. Typically, it increases during the course of a preclear cycle, but there are some drives where the "raw" temperature reported is below ambient. That is impossible, but only if you assume the number has been converted to centigrade. If it is really "raw" and must be interpreted using a conversion factor, then we cannot even use it to determine if a drive has overheated. Perhaps the number is supposed to be 10 degrees low, to allow a higher top temperature. Unfortunately, there is no consistency between models and brands; only the disk manufacturer knows.

With that in mind, I'll repeat: about the only raw values you can interpret yourself are those for re-allocated sectors, sectors pending re-allocation, and drive temperature (and the temperature might not be reported accurately at that).

The other "raw" values may change and show up in the "diff" output of the preclear script. Unless you are the manufacturer, you can probably ignore them, as they have no meaning to you or anyone else other than the manufacturer, and they aren't telling.

Your drive passed the preclear, and as long as you do not see an increase in the number of re-allocated sectors or sectors pending re-allocation, you should be fine.

Joe L.
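Since only the re-allocated and pending counts are worth comparing, a narrower check than a full "diff" can be sketched as follows. This is an illustration under assumptions, not what the preclear script actually does: `smart_raw` and `smart_growth` are made-up names, and the reports are assumed to be full `smartctl -a` captures in which the raw value is the tenth column of each attribute row.

```shell
#!/bin/sh
# Hypothetical helper: raw value of attribute $1 in saved report $2.
smart_raw() {
    awk -v name="$1" '$2 == name { print $10; exit }' "$2"
}

# Hypothetical check: compare the two sector counters between a
# "before" report ($1) and an "after" report ($2). Prints one line
# per attribute and returns 1 if either counter increased.
smart_growth() {
    status=0
    for attr in Reallocated_Sector_Ct Current_Pending_Sector; do
        old=$(smart_raw "$attr" "$1")
        new=$(smart_raw "$attr" "$2")
        echo "$attr: ${old:-?} -> ${new:-?}"
        # Both raw values are plain integers for these attributes;
        # skip the comparison if one report lacks the attribute.
        if [ -n "$old" ] && [ -n "$new" ] && [ "$new" -gt "$old" ]; then
            status=1
        fi
    done
    return "$status"
}
```

With the file names used earlier in this thread, that would be `smart_growth /boot/smartctl.sda.beforetest /boot/smartctl.sda.aftertest`. Unlike "diff", it still prints a counter that is unchanged, so a value that was already non-zero stays visible.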
July 7, 2010
"I took a peek and saw 2 pending sectors. That means 2 sectors are unreadable and the next write to those sectors will remap them somewhere else."

I have taken ownership of this drive, which was previously owned by HTLuver. The drive was originally installed as a parity drive without the jumper (7-. I have since placed a jumper on it, run a preclear, and installed it as a data drive.

I had trouble getting it to clear and format. Why it went to clear, I don't know. I then found there was a newer version of unRAID, 4.5.6, which I updated to, but it still wanted to do a clear even after the preclear. In any case, after the clear it displayed the format option, which I ran with no issue.

I ran a SMART log, which now shows 1 pending sector instead of 2. I have attached the latest log for the drive.

Can I assume this drive is OK to use based on the log?

Thanks

Attachment: 7-7-2010.txt
July 7, 2010
"I ran a smart log which now depicts 1 pending sector instead of 2 sectors. I have attached the last log of the drive. Can I assume this drive is OK to use based on the log? Thanks"

According to this in the SMART report:

    # 1  Short offline       Completed: read failure       60%       841         2930277167
    # 2  Short offline       Completed: read failure       60%       841         2930277167

the "short" test you recently ran failed. The cause is probably the one sector pending re-allocation.

Based on the history of the drive, I'd run a full preclear_disk.sh cycle on it before trusting it with data. If it cannot pass a "short" test on its own, and if different sectors keep showing up as unreadable, it is a good candidate for an RMA. Just keep records of the failures in case the RMA is contested.

Joe L.
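For keeping records of such failures, the failed entries can be filtered out of a saved report. This is a sketch only: `selftest_failures` is a made-up name, and it assumes the self-test log rows end with the three columns shown above (percent remaining, lifetime hours, LBA of first error).

```shell
#!/bin/sh
# Hypothetical helper: list self-test log entries that ended in a
# read failure, from a report saved with "smartctl -a".
#   $1 = path to the saved report
selftest_failures() {
    # Log rows start with "#" and the entry number. The last three
    # columns are counted from the end of the row because the
    # description and status columns contain spaces.
    awk '/^# *[0-9]+/ && /read failure/ {
        printf "remaining=%s hours=%s lba=%s\n", $(NF-2), $(NF-1), $NF
    }' "$1"
}
```

Appending its output to a dated file on the flash drive after each test would give a simple history of which sectors failed and when.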
July 7, 2010
OK, Joe L., I'll run the preclear again. (Update: 24 hours and 17 minutes in, and only at 52%.)

I noticed that this time the read speed averages 10MB/sec (update: 9MB/sec), while last time it was 45MB/sec. This is going to take a while.

Last time I got the SMART log from unMENU. Is there a better way (I'm a newbie) to retrieve the data for drive sdc from the preclear process? I do have some experience (2 months) running basic commands from PuTTY and the console.

Good or bad, I want to remove drive sdc from the array until I need it. In attempting to do this, I found that the restore button has been removed. Can I use the initconfig command to reset the array so I can achieve parity again without sdc?

Thanks in advance.
July 8, 2010
"Can I use the initconfig command to reset the array so I can achieve parity again without sdc?"

Yes. That is exactly what you need to use instead of the button.
July 10, 2010
Thanks, Joe L. The longest preclear ever is completed.

The pending count now looks to be "0". It was 2 originally, then 1 after another preclear, and now zero, but the Short offline entry still shows 60%. See attached log.

What is the meaning of the 60%?

Should I start working on an RMA?

Thanks for your advice.

Attachment: 7-10-2010.txt
July 10, 2010
"What is the meaning of the 60%?"

It indicates the short test aborted when it still had 60% of the test remaining to run.

"Should I start working on an RMA?"

Based on the "short" test, yes. But there is conflicting information, in that there are no sectors re-allocated and none pending re-allocation. That indicates the sectors that could not be read were subsequently readable after writing to them, and no re-allocation was needed. (That is good, so no RMA is needed, unless the problem is intermittent, in which case "yes".)

I'd run a "long" test on the drive and see if it can pass.

Joe L.
July 10, 2010
Thanks, Joe L. I figured you would want to see a long test, but I closed the unMENU window on my laptop. Well, I'll start over (with no spin-down selected on the drive), and it should be done around 1pm.