Joe L.
-
Posts
19,009 -
Joined
-
Last visited
-
Days Won
1
Posts posted by Joe L.
-
-
All disks have raw-read-errors, some report them, some do not.
Is this disk unsafe to use, with the Raw_Read_Error_Rate being so high?
The failure threshold of 006 for that parameter in SMART is not close to the current normalized value of 116, however the "worst" normalized value has been as low as 99, so I'd keep an eye on the disk over the next months/years. It looks great otherwise.
1 Raw_Read_Error_Rate 0x000f 116 099 006 Pre-fail Always - 105202000
The normalized value actually improved from 114 to 116 over the course of the preclear. That is a good sign.
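To make the comparison above concrete, here is a minimal Python sketch (not part of preclear or smartctl) that splits the attribute line quoted above into its columns and shows how far the current and worst normalized values sit above the failure threshold. The column order is assumed to be the usual smartctl attribute table layout (ID, name, flag, value, worst, thresh, type, updated, when-failed, raw), and the names SmartAttribute and parse_smart_line are made up for this illustration.

```python
from typing import NamedTuple


class SmartAttribute(NamedTuple):
    attr_id: int
    name: str
    value: int   # current normalized value
    worst: int   # lowest normalized value recorded so far
    thresh: int  # failure threshold set by the manufacturer
    raw: str     # raw value, kept as text because its meaning is vendor specific


def parse_smart_line(line: str) -> SmartAttribute:
    """Split one row of the attribute table into its columns (assumed layout)."""
    f = line.split()
    # Assumed columns: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW
    return SmartAttribute(int(f[0]), f[1], int(f[3]), int(f[4]), int(f[5]), f[9])


line = "1 Raw_Read_Error_Rate 0x000f 116 099 006 Pre-fail Always - 105202000"
attr = parse_smart_line(line)

print(attr.name)
print("current margin above threshold:", attr.value - attr.thresh)  # 110
print("worst margin above threshold:", attr.worst - attr.thresh)    # 93
```

The raw value is deliberately not compared with anything here; only the normalized value, the worst value, and the threshold matter for the check described above.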
-
Good point.
It will NOT find the un-readable sectors when using that option, so just be aware of the risks.
(Far more likely to suffer read errors (un-correctable-media-errors) once put online)
True -- however, if it's a "known working drive" that's been in use and is known to be in good shape, it's MUCH faster to (a) do a preclear with the -n parameter; (b) add it to the array; and then (c) run a parity check => something you should do anyway when adding a new drive, and that WILL find any unreadable sectors
-
It will NOT find the un-readable sectors when using that option, so just be aware of the risks.
Run it with the -n parameter and it will only zero the drive and write the signature. I do that all the time when I transfer drives from one unRAID server to the other.
Thank you!
(Far more likely to suffer read errors (un-correctable-media-errors) once put online)
-
In the latest 5.X releases the array must be started for preclear to find the assigned disks. Yes, known issue with the latest unRAID. It used to keep a "readable" file with the disk assignments in the /boot/config folder; apparently, it no longer does.
This may have already been mentioned.. but I can't seem to find it with the search..
Just got a new drive and went to preclear it.. ran preclear_disk.sh -l and it is showing ALL my disks, not just the unassigned ones. Is this a known issue or is there something wrong that might cause this??
Cheers,
whiteatoms
MAKE SURE YOU KNOW THE DISK YOU WILL BE CLEARING BY THE SERIAL NUMBER...
Joe L.
-
I agree, the preclear script should do better in detecting the complete failure of a drive. It is not easy, however, as the "dd" commands used do not always give clear indications of a failure. (preclear does not look in the syslog)
RobJ,
Thanks very much for the insight!
Yeah, this drive is a little wonky, I'd only had it for a few months, running in my MBP, and all of a sudden it started taking FOREVER to read. I was watching a movie with XBMC and it stuttered every few seconds. I couldn't agree more on not using it in the array, I was just hoping that the combined stress level of preclearing and the drive's own mechanisms for fixing its problems would make it useful for something besides a coaster. (now there's an idea, old drives encased in lexan for drink coasters? who doesn't need a frosty adult beverage whilst constructing the perfect parity-protected server?!)
I'm running the fsck now, but I do have a few more questions.
If the kernel detected a lost packet, why did the 3TB drive (md12) not redball? As you see from the dates, this was a couple of days ago, and the array still showed all green after the issue.
What could, and I know this is probably all guesswork, cause the 1TB drive to "fail" in this manner?
And thirdly, I'd like to second the respectful request to Joe L. to take a look at the preclear script. These don't look like 'PASS'es to me! Don't get me wrong here, I, like probably everyone else on this forum am indebted to Joe multiple times over for his advice and unmenu and everything else. Just a question/bug(maybe).
Thanks!
Pengrus
p.s. Oh, i forgot one more...is there any way to tell what "command c21f9f00" is?
-
I'll check it out. I have a server running 5.0rc16c. Thanks for the report.
Hi Joe, found a quirk. Running unRaid 5.0-rc16c and preclear 1.13.
In the unRaid disk settings, set Enable Auto Start to No, then reboot. Run preclear_disk.sh -l.
All disks, whether assigned to the array or not, are listed as available for clearing.
Start the array. Only the correct disks are listed for clearing.
Stop the array. The correct disks are still listed.
-
I wrote it, and I would not even guess. I would not use more than -b 2000 personally. You are asking it to read in 1.6 GB chunks.
I'm about to buy another 2 4TB drives and intend to preclear them both at the same time.
I've got a machine that can be dedicated just to preclearing and it will be running virtually stock unRAID 5.x apart from the addition of a few small-footprint apps like unMENU and screen.
It has 3GB of RAM. The last time I precleared a single 4TB drive, on a machine running a ton of add-ons, but with 8GB of RAM, I was getting lots of memory errors (but no crash of the preclear process). I imagine that's because I was running out of lowmem while using the other apps.
The command line I used was:
preclear_disk.sh -r 65536 -w 65536 -b 25600 /dev/sdb
Can someone please advise what would be the optimum or suggested command line parameters to preclear 2x 4TB drives with 3GB RAM?
I guess you'd want a combination of speed + memory safety.
Cheers!
65536 * 25600
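The multiplication above can be spelled out with a short Python sketch. It assumes, as the reply implies, that -r is the read block size in bytes and -b is the number of blocks handled per read, so their product is the size of one read chunk. Whether the script actually holds that much in memory at once is an assumption here, and read_chunk_bytes is a name invented for this example.

```python
def read_chunk_bytes(block_size: int, block_count: int) -> int:
    """Size in bytes of one read chunk, assuming chunk = block size * block count."""
    return block_size * block_count


# Values from the command line quoted above: -r 65536 -b 25600
asked = read_chunk_bytes(65536, 25600)
# Same block size with the suggested upper limit of -b 2000
suggested = read_chunk_bytes(65536, 2000)

print(asked, "bytes =", asked / 2**30, "GiB")          # 1677721600 bytes = 1.5625 GiB
print(suggested, "bytes =", suggested / 2**20, "MiB")  # 131072000 bytes = 125.0 MiB

# Preclearing two drives at the same time doubles whatever each process needs.
print("two drives, as asked:", 2 * asked, "bytes")
```

On a 3GB machine, two chunks of the size originally asked for would take more than the total RAM under that assumption, which fits the advice to stay at or below -b 2000.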
-
ALL disks have hardware error correction errors; some report them, some do not. It is normal. The important thing to check is if the "normalized" value is above the affiliated error threshold. If it is, the drive is working as expected.
All that said, the current normalized value is 031, the worst value it has been is 006, and the failure threshold is 0. Keep an eye on the disk; if the value continues to drop low into the single digits, it might be an indication of a drive nearing the end of its life. (it has been spinning for 16127 hours)
-
tr = "translate"
Nothing at all wrong with the zip file. It has no ms-dos carriage returns in it. (I just downloaded it and un-zipped it on my server)
Hmmm, I wonder why it's doing it on my system? Thanks for checking it out! At least it gave me the opportunity to learn about tr (truncate)!
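For anyone curious what "translate" means in practice: tr maps or deletes individual characters in a stream, and deleting carriage returns is the classic use when a script has picked up ms-dos line endings. Below is a rough Python equivalent of that check and cleanup, offered only as a sketch. The sample bytes are invented for the example and are not the contents of cache_dirs.

```python
def count_carriage_returns(data: bytes) -> int:
    """Number of ms-dos carriage return bytes in the data."""
    return data.count(b"\r")


def strip_carriage_returns(data: bytes) -> bytes:
    """Delete every carriage return, leaving all other bytes untouched."""
    # bytes.translate with no table and a delete set only removes bytes,
    # which is the same idea as deleting a character with tr.
    return data.translate(None, b"\r")


# Invented sample: a two-line script saved with ms-dos line endings.
dos_text = b"#!/bin/bash\r\necho hello\r\n"

print(count_carriage_returns(dos_text))
print(strip_carriage_returns(dos_text))
```

A count of zero on the downloaded file would match what was reported above, in which case any carriage returns were introduced after the download.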
-
Nothing at all wrong with the zip file. It has no ms-dos carriage returns in it. (I just downloaded it and un-zipped it on my server)
Not that I'm aware of... But which download link are you referring to?
Thanks Joe L. for writing! The link is the bottom of your original post in this thread, shown as cache_dirs.zip:
http://lime-technology.com/forum/index.php?action=dlattach;topic=4500.0;attach=14315
I appreciate the help!
-
Not that I'm aware of... But which download link are you referring to?
So, is there a problem with the file format of the file in the download link?
-
I do not know the default...
Incorrect, if no option is specified, then the unRAID configuration preference setting is used.
I thought that the latest version of the pre-clear script would, when run on a system with UnRAID v5, automatically always use sector 64 => but perhaps that's not the case.
Look under
Settings->Disk Settings->Default-partition-format
to set your default preference.
Thanks for the details Joe. But just for clarification, isn't it true that the default setting for v5 is 4K aligned? ... in which case I'd think you don't need the -A switch (with v5) UNLESS you've changed that setting. Correct?
-
Incorrect, if no option is specified, then the unRAID configuration preference setting is used.
On my second preclear I forgot to add the -A switch when executing the command. So now I have my first disk precleared with a starting sector of 64 and the second one with a starting sector of 63. Any issues here?? Do I need to run another preclear on the 2nd disk with the -A option?
I thought that the latest version of the pre-clear script would, when run on a system with UnRAID v5, automatically always use sector 64 => but perhaps that's not the case.
Look under
Settings->Disk Settings->Default-partition-format
to set your default preference.
One more thing...
Unless your disk is a WD "EARS" drive with the firmware that performs poorly when partitioned to start on sector 63, then odds are you would not notice any difference in performance at all regardless of where the partition starts. (and even that drive would work just fine with the partition starting on sector 63 unless you were really anal about performance) You likely could have left things as they were.
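The difference between the two starting sectors comes down to simple arithmetic, sketched here in Python. The sketch assumes 512-byte logical sectors and 4096-byte physical sectors (the "4K" in "4K aligned"); those two sizes are assumptions about the drive, not something taken from the preclear script, and is_4k_aligned is a name made up for this example.

```python
LOGICAL_SECTOR_BYTES = 512    # assumed logical sector size
PHYSICAL_SECTOR_BYTES = 4096  # assumed physical sector size of a 4K drive


def is_4k_aligned(start_sector: int) -> bool:
    """True when the partition's first byte falls on a physical sector boundary."""
    offset = start_sector * LOGICAL_SECTOR_BYTES
    return offset % PHYSICAL_SECTOR_BYTES == 0


for start in (63, 64):
    offset = start * LOGICAL_SECTOR_BYTES
    print(start, offset, offset % PHYSICAL_SECTOR_BYTES, is_4k_aligned(start))
# 63 32256 3584 False
# 64 32768 0 True
```

A partition starting on sector 63 therefore straddles physical sectors, which only costs performance on drives that handle the misalignment poorly, as described above. The data is just as safe either way.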
-
Correct
I'm in the process of pre-clearing several new 4TB Seagate NAS drives, and have the following results from the first 2 drives (the two sets of results are nearly identical, so I've only listed one, as it shows what I'd like to know).
I'd appreciate some feedback on the results. My thoughts/questions are as follows:
(1) The Raw Read Error Rate actually shows improvement (from 100 to 110), so I assume it's fine, and I should simply ignore the raw value. Is that correct?
Anything with a current normalized value within 25 of its affiliated failure threshold is "near threshold". For some attributes, the manufacturer sets the failure threshold within a count or two of the factory initial value.
(2) Both the Spin Retry Count and End-to-End Error values are the same (100 before, 100 after), but the status shows "Near Threshold". Any comment on these? ... or are they fine and "ignorable" ??
Same as above.
(4) The temps seem fine. Not sure why the airflow is shown as "Near Threshold", but I don't see anything to worry about -- agree?
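The "within 25" rule is easy to try out. The Python sketch below is an illustration of the rule as described, not the code from the preclear script. The values and thresholds are copied from the preclear report quoted further down this page; treating "within 25" as a margin of 25 or less, and treating a value at or below the threshold as failing, are assumptions.

```python
NEAR_MARGIN = 25  # "within 25 of its affiliated failure threshold"


def attribute_status(value: int, thresh: int) -> str:
    """Classify a normalized SMART value against its failure threshold."""
    if value <= thresh:
        return "FAILING_NOW"   # assumed: at or below the threshold counts as failed
    if value - thresh <= NEAR_MARGIN:
        return "near_thresh"   # assumed: a margin of exactly 25 still counts as near
    return "ok"


# (name, current normalized value, failure threshold) from the quoted report
attributes = [
    ("Raw_Read_Error_Rate", 118, 6),
    ("Spin_Retry_Count", 100, 97),
    ("End-to-End_Error", 100, 99),
    ("Airflow_Temperature_Cel", 64, 45),
    ("Temperature_Celsius", 36, 0),
]

for name, value, thresh in attributes:
    print(name, value - thresh, attribute_status(value, thresh))
```

Spin_Retry_Count and End-to-End_Error come out as near_thresh only because the manufacturer put the threshold 3 and 1 counts below the factory value of 100, which is the point made in the reply.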
-
From the SMART Wiki:
Where's the report showing the changes?
Is my hard drive ready?
The high-fly-writes went from 100 to 75 during the pre-clear. Although nowhere near the failure threshold, keep an eye on it in the next few months.
Thanx Joe .... I will add it to the array now to replace another failing hd.
What is a high-fly? Should I be worried? I will keep an eye on it the next few months. Sorry, here is the report; I forgot there were 3 files.
High Fly Writes
HDD producers implement a Fly Height Monitor that attempts to provide additional protections for write operations by detecting when a recording head is flying outside its normal operating range. If an unsafe fly height condition is encountered, the write process is stopped, and the information is rewritten or reallocated to a safe region of the hard drive. This attribute indicates the count of these errors detected over the lifetime of the drive.
This feature is implemented in most modern Seagate drives[1] and some of Western Digital’s drives, beginning with the WD Enterprise WDE18300 and WDE9180 Ultra2 SCSI hard drives, and will be included on all future WD Enterprise products.[20]
-
Where's the report showing the changes?
Is my hard drive ready?
The high-fly-writes went from 100 to 75 during the pre-clear. Although nowhere near the failure threshold, keep an eye on it in the next few months.
-
I did some research. The number after the "+" is the number of partial blocks read or written. Since the pre-clear script periodically sends a query to the process actually writing a disk that is using "dd", and has no way to synchronize the request, it is possible to get a response while in the middle of writing a "block" of data. For that reason you'll see the +n increment over time. It is just a measure of how many times you managed to ask its percentage complete when it was in the middle of writing a block of data.
They are just the statistics reported by the "dd" command. I never cared what they meant (was more interested in the number of blocks read). Probably indicates some partial blocks of bytes. (nothing to worry about) You would have to look up the "dd" command to learn more.
Joe L.
Joe L.
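As an illustration of the "+n" explanation, here is a small Python sketch that pulls the two numbers out of a dd status line. The sample line is made up for the example (it is not taken from anyone's preclear output), and parse_dd_records is a hypothetical helper, not something the preclear script contains.

```python
import re

# dd reports progress as "<full blocks>+<partial blocks> records in" (or "out").
RECORDS = re.compile(r"^(\d+)\+(\d+) records (in|out)$")


def parse_dd_records(line: str) -> tuple[int, int, str]:
    """Return (full_blocks, partial_blocks, direction) from one dd status line."""
    match = RECORDS.match(line.strip())
    if match is None:
        raise ValueError(f"not a dd records line: {line!r}")
    return int(match.group(1)), int(match.group(2)), match.group(3)


sample = "1907729+3 records out"  # invented sample line
full, partial, direction = parse_dd_records(sample)
print(full, partial, direction)
```

In this made-up sample, the 3 would mean dd was caught part-way through a block three times, which is harmless for the reasons given above.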
-
They are just the statistics reported by the "dd" command. I never cared what they meant (was more interested in the number of blocks read). Probably indicates some partial blocks of bytes. (nothing to worry about) You would have to look up the "dd" command to learn more.
Joe L.
-
The errors you are getting are consistent with a drive that has stopped responding to commands.
Here is the output
root@Tower3:/boot/scripts# hdparm -i /dev/sdh
/dev/sdh:
HDIO_DRIVE_CMD(identify) failed: Input/output error
HDIO_GET_IDENTITY failed: No message of desired type
I think UNRAID has an issue with the Supermicro AOC-SASLP-MVL8. Is this SATA addon card supported? I thought it was or I would not have spent over $100 on it.
Can anyone confirm that the Supermicro AOC-SASLP-MVL8 is supported in UNRAID?
I posted a syslog of startup with the SATA board installed. Can someone confirm it is ok?
http://lime-technology.com/forum/index.php?topic=28406.0
It could be the disk itself, or a cable to it, or a disk controller port. It may even start responding again if power cycled. (it is locked up by buggy firmware on the disk)
In any case, you can isolate the issue by re-seating/replacing the cables to the drive, trying a different disk controller port, etc.
-
Looks to me like the drive is not responding at all.
I am wondering if this Supermicro AOC-SASLP-MVL8 is causing me issues. They are supposed to be great SATA controllers.
I tried this:
smartctl -a -d sat /dev/sdh | todos >/boot/data/smart_report_disk6.txt
Gave me this:
smartctl 5.40 2010-10-16 r3189 [i486-slackware-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF INFORMATION SECTION ===
Device Model: [No Information Found]
Serial Number: [No Information Found]
Firmware Version: [No Information Found]
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: [No Information Found]
ATA Standard is: [No Information Found]
Local Time is: Tue Jul 9 20:56:20 2013 EDT
SMART support is: Ambiguous - ATA IDENTIFY DEVICE words 82-83 don't show if SMART supported.
SMART support is: Ambiguous - ATA IDENTIFY DEVICE words 85-87 don't show if SMART is enabled.
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.
Have you tried
hdparm -i /dev/sdh
-
Me too. Looks like the disk might not be responding to a smartctl report request.
I have seen the pre_clear script produce unexpected results if the disk goes off-line during the pre-clear process.
-
Something happened prior to the preclear to mark those sectors as having a checksum at the end of the sector that did not match the contents of the sector.
Thanks very much for the rapid reply.
I think I'll run another pre-clear on that drive, and then monitor it closely.
As already mentioned, they were re-written in place, and the contents then matched the expected checksum for those 7 sectors.
A subsequent preclear cycle will tell you more. With any luck you'll not see any additional un-readable sectors
(all checksums at ends of sectors will match the contents of their affiliated sector)
Joe L.
-
Looks great!!!
If it completed it would have logged the results in the /boot/preclear_reports directory.
Ah-ha!
========================================================================1.13
== invoked as: ./preclear_disk.sh -A /dev/sdb
== ST4000DM000-1F2168 W3009TM3
== Disk /dev/sdb has been successfully precleared
== with a starting sector of 1
== Ran 1 cycle
==
== Using :Read block size = 8225280 Bytes
== Last Cycle's Pre Read Time : 9:32:25 (116 MB/s)
== Last Cycle's Zeroing time : 8:22:56 (132 MB/s)
== Last Cycle's Post Read Time : 19:03:40 (58 MB/s)
== Last Cycle's Total Time : 37:00:07
==
== Total Elapsed Time 37:00:07
==
== Disk Start Temperature: 29C
==
== Current Disk Temperature: 36C,
==
============================================================================
** Changed attributes in files: /tmp/smart_start_sdb /tmp/smart_finish_sdb
ATTRIBUTE                 NEW_VAL OLD_VAL FAILURE_THRESHOLD STATUS      RAW_VALUE
Raw_Read_Error_Rate     = 118     100     6                 ok          182808288
Spin_Retry_Count        = 100     100     97                near_thresh 0
End-to-End_Error        = 100     100     99                near_thresh 0
Airflow_Temperature_Cel = 64      71      45                near_thresh 36
Temperature_Celsius     = 36      29      0                 ok          36
No SMART attributes are FAILING_NOW
0 sectors were pending re-allocation before the start of the preclear.
0 sectors were pending re-allocation after pre-read in cycle 1 of 1.
0 sectors were pending re-allocation after zero of disk in cycle 1 of 1.
0 sectors are pending re-allocation at the end of the preclear,
the number of sectors pending re-allocation did not change.
0 sectors had been re-allocated before the start of the preclear.
0 sectors are re-allocated at the end of the preclear,
the number of sectors re-allocated did not change.
============================================================================
Joe L.
-
If it completed it would have logged the results in the /boot/preclear_reports directory.
cache_dirs - an attempt to keep directory entries in RAM to prevent disk spin-up
in User Customizations
Posted