Joe L.

Members
  • Posts

    19,010
  • Joined

  • Last visited

  • Days Won

    1

Everything posted by Joe L.

  1. Yes. Just make sure you use the latest version of the preclear script (1.13 is the newest at this time). Joe L.
  2. Your disk looks fine. There are only two items in the SMART report worth mentioning:
       9 Power_On_Hours 0x0032 061 061 000 Old_age Always - 34907
     199 UDMA_CRC_Error_Count 0x003e 200 197 000 Old_age Always - 34
     The first is the run-time hours (the drive has been in operation for about 4 years). The UDMA CRC errors are usually noise pickup from cables. (Try NOT to be obsessive about cable management unless you use good-quality SHIELDED cables.) Do not tie-wrap SATA cables together, and definitely not with power cables. The errors are not bad, but you should be aware of their cause. Lastly, I'd much rather trust an older drive such as this than a brand-new, untested drive. Good luck with your test server.
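The "about 4 years" figure comes straight from the raw Power_On_Hours value. A minimal sketch of that arithmetic, assuming the drive ran continuously and using 8760 hours per year (the function name `power_on_years` is made up for illustration):

```shell
# Convert a SMART Power_On_Hours raw value into approximate years of run time.
# Assumption: 8760 hours per (non-leap) year, drive powered on continuously.
power_on_years() {
    LC_ALL=C awk -v hours="$1" 'BEGIN { printf "%.1f\n", hours / 8760 }'
}

power_on_years 34907    # prints 4.0
```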
  3. The jumper ONLY applies to an EARS drive; yours is a different model, an EARX drive. No jumper is used with that drive, ever (so you are good there). If the drive cannot pre-clear, try a different cable and a different disk controller port, and if nothing works, RMA it. You can use the -A option. It is as good as any for your drive, although it would work just fine with either partition alignment. The setting has absolutely nothing to do with the inability to be cleared. As far as how to proceed, use the older disk that was able to be pre-cleared. Joe L.
  4. It means the disk is timing out when communications to it are attempted. It could be a bad disk, a bad disk controller, a poor power supply, or poor-quality splitter/drive cage connections. Notice there are TWO disks involved. They may share a common controller, or one might be causing the lock-up of the other sharing a disk controller. Sometimes it is just the drive that is confused and a power cycle will fix it; other times it will not. Get a SMART report of the drives involved:
     smartctl -a /dev/sdi
     smartctl -a /dev/sdh
  5. On the Windows machine, open File Explorer and navigate to \\tower\flash\preclear_reports, then just copy the reports desired.
  6. Seek_Error_Rate = 44 44 30 near_thresh 790278696138
     I would only be concerned with this parameter, since the normalized value (44) is getting close to its failure threshold (30), and odds are the starting value was 100 or 200. The number of re-allocated sectors did not change, and that is good, but the number 343 is very high, and most people would RMA the drive based only on the number of re-allocated sectors. Since the seek error rate is iffy and the re-allocated sector count is high, I'd RMA. (The other parameters that are near their thresholds just have very high thresholds... they are not an issue.)
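To make "getting close to its failure threshold" concrete, here is a small sketch that computes the remaining margin. It assumes the report line is read as a normalized value of 44 against a threshold of 30; `smart_margin` is a hypothetical helper, not part of smartctl or the preclear script.

```shell
# Margin between a SMART attribute's normalized value and its failure threshold.
# An attribute counts as failed once the value is at or below the threshold,
# so a small positive margin means the attribute is close to failing.
smart_margin() {
    LC_ALL=C awk -v value="$1" -v thresh="$2" 'BEGIN { print value - thresh }'
}

smart_margin 44 30     # prints 14
smart_margin 100 30    # prints 70 (the margin if the drive started at 100)
```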
  7. Nothing to be alarmed about. Both the "short" and "long" tests are automatically aborted if the drive is spun down.
  8. I would not set cache_pressure to zero. I would, in fact, set it closer to 100 as an experiment to see if it helps you. Obviously, we cannot control inodes cached vs. data blocks. (It would be nice, but that detailed control is not there.) Joe L.
  9. Does it show up in a process list? Remember, there are background processes performing the "find" commands. It will not completely self-terminate until it finishes the "find" it is currently performing. Type
     ps -ef | grep cache_dirs
     to see if it is still running. (The -q command indicated it killed process id 2395.) lsof seems to indicate there is a process id 32681 still active.
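One wrinkle with `ps -ef | grep cache_dirs` is that the grep command itself usually appears in the output and can look like a surviving process. A minimal sketch of a filter that drops it (`filter_procs` is a hypothetical helper name; it reads `ps -ef` style text on standard input):

```shell
# Print only the process-list lines that mention the given name,
# leaving out the grep command used to do the search.
filter_procs() {
    grep -- "$1" | grep -v grep
}

# Usage: ps -ef | filter_procs cache_dirs
# No output means no matching process is still running.
```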
  10. cache_dirs has parameters to limit the number of levels of directories cached. It also has parameters to exclude specific directories. It sounds as if you need to do one of the following:
      A. Install more RAM.
      B. Limit the directory depth cached.
      C. Limit (include or exclude) specific user-shares to limit the directories cached.
      D. Use the min-time and max-time parameters to set the min and max time between "find" commands in cache_dirs.
      E. Modify cache_dirs as it suits YOUR needs. (It is just a shell script, after all.)
      F. Stop using cache_dirs.
      Apparently, your limited memory and high number of directory nodes, COMBINED with other use of memory on your server, cause the cache_dirs "find" command to take too long, and the directory entries in memory end up being freed for re-use by other processes accessing the disks. Joe L.
  11. So for some reason I logged back in to my box and I only have two out of three Screen sessions, and I initiated them all via telnet, so none on console. I ran the -t on the drive I don't see in my other existing sessions and it says precleared; however, I don't have a report. So what should I do next? Results below:
      Pre-Clear unRAID Disk /dev/sda
      ################################################################## 1.13
      Device Model: ST3000DM001-9YN166
      Serial Number: W1F0RSQZ
      Firmware Version: CC4B
      User Capacity: 3,000,592,982,016 bytes
      Disk /dev/sda: 3000.5 GB, 3000592982016 bytes
      255 heads, 63 sectors/track, 364801 cylinders, total 5860533168 sectors
      Units = sectors of 1 * 512 = 512 bytes
      Disk identifier: 0x00000000
      Device Boot Start End Blocks Id System
      /dev/sda1 1 4294967295 2147483647+ 0 Empty
      Partition 1 does not end on cylinder boundary.
      ########################################################################
      ========================================================================1.13
      ==
      == DISK /dev/sda IS PRECLEARED with a GPT Protective MBR
      ==
      ============================================================================
      Thanks! Neil
      I would look in /boot/preclear_reports
  12. Easy, the third phase is reading AND verifying that all bytes read are all zeros (they were written as zeros in the second phase). The first reading phase sends all the data read to /dev/null, and no analysis is performed. It does not care about what is read, other than to allow the disk's SMART firmware to identify un-readable sectors. It can run much faster. The verification in the third phase makes it run at about half the speed.
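The difference between the two read passes can be sketched in shell. This is only an illustration under stated assumptions, not the preclear script's actual code: it works on a regular file or disk image, relies on `cmp -n` (available in GNU diffutils), and `is_all_zeros` is a made-up name.

```shell
# Pre-read style: pull every byte off the media and discard it.
# Nothing is inspected, so this can run at full speed:
#   cat disk.img > /dev/null

# Post-read style: every byte must also be checked against zero.
# Succeeds only if the whole file consists of zero bytes.
is_all_zeros() {
    size=$(( $(wc -c < "$1") )) || return 2
    cmp -s -n "$size" "$1" /dev/zero
}

# Usage: is_all_zeros disk.img && echo "verified: all zeros"
```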
  13. Anyone? Type
      preclear_disk.sh -?
      and you'll see all the options. You can try the "-t" option to see if the pre-clear signature was written to the disk. That would indicate it did all but the post-read phase. Yes, you can skip the initial pre-read. Use the "-W" option. Joe L.
  14. You look in /boot/preclear_reports. If it completed, the reports are there.
  15. I did. kill -9 did not kill it. If a process is in a system call waiting on a kernel-level interrupt, no signal can kill it (not even kill -9). Typically, this is due to some kind of driver bug and a deadlock situation in the kernel. All you can do at that point is what you did... reboot.
  16. Unformatted indicates not mounted. It could be the result of a poor shutdown, with the transactions on the drives being replayed. That can take 30 minutes or more. If you had waited, the drive might have mounted itself and then shown as expected. Unless you know WHY the drive was not mounted, you have no reason to RMA the drive. If it happens again, capture the syslog for analysis.
  17. You stop the array, assign the drive to an unRAID slot, then when you start the array it will present the drive as unformatted and show a "Format" button. Pressing it will format the drive and allow it to be used in the array. If not pre-cleared, and added to an array with established parity protection, the drive would first be cleared by unRAID. This clearing step takes many hours, during which your array is off-line. It is one of the main reasons the drive preclear script was developed: to eliminate the lengthy down-time. (lime-tech used to sell pre-cleared drives, but found it impossible to compete with Newegg, etc.) When the pre-clear script is complete, a special signature is written to the drive to allow it to be recognized as pre-cleared.
  18. Well, you used the "-n" option.
      -n = Do NOT perform preread and postread of entire disk that would allow SMART firmware to identify bad blocks.
      Therefore, it only wrote to the disk. You asked it to not bother to locate any unREADABLE sectors. It therefore could not re-allocate any that were un-readable (since it does not yet know they are un-readable... it has never tried to read them). Normally, the disk is fully read first, to allow it to determine if any sectors are un-readable. Normally it is subsequently re-read AFTER zeroing to ensure the zeros were properly written. You elected to skip both those steps when you used the "-n" option. The portions of the process you elected to skip are why the disk finished so quickly. Good luck. The drive may be added to an existing array without unRAID clearing it, but you might want to monitor it more closely as you load it with data. At the least, perform a full manual parity check once installed (that will read it in its entirety).
  19. For disks that will have a GPT partition defined by unRAID, a "protective" MBR partition is put into place to fool older utilities into thinking the entire disk is allocated, even if they have not been upgraded to handle GPT partitions (most older programs have not). I suppose that the next time I update the preclear script I can describe that "protective" partition better. The protective partition starts on sector 1 and covers as much of the drive as possible, using the bits available in the MBR structure. Since the MBR sector-count field has only 32 bits, the biggest disk partition the old-style MBR can define is 2.2TB (with 512-byte sectors). The protective partition is therefore defined as 2.2TB. That protective MBR is not actually used on GPT-formatted drives... Think of it as a placeholder to make older utilities and programs happy. The actual GPT partition is always 4k-aligned and defined in an area further past the first 512 bytes of the disk, where the MBR is located. It utilizes the entire 3TB drive. Joe L.
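The 2.2TB ceiling falls out of simple arithmetic: a classic MBR partition entry stores its sector count in a 32-bit field, and sectors are addressed in 512-byte units. A quick sketch of the calculation (the function name `mbr_max_bytes` is just for illustration, and 512-byte sectors are assumed):

```shell
# Largest size a single MBR partition entry can describe:
# 2^32 sectors times 512 bytes per sector.
mbr_max_bytes() {
    echo $(( (1 << 32) * 512 ))
}

mbr_max_bytes    # prints 2199023255552, i.e. about 2.2 TB
```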
  20. Looks great. No sectors marked for re-allocation, and none re-allocated. Nothing else failing or near failing. Joe L.
  21. The first preclear showed that the drive had failed SMART at some point in the past on an over-temperature threshold. (It was currently within temp limits.) The drive started with 3 sectors that had been re-allocated, and with 517 that were pending re-allocation. The pre-clear then read the entire drive... allowing the drive to detect any additional sectors it could not read. It did not find any, so there were still 517 sectors pending re-allocation. The pre-clear then wrote zeros to the entire drive. The disk FIRST tried re-writing the original location on the disk platter rather than re-allocating. That apparently worked for 509 sectors. (Or rather, after the post-read, there were only 8 sectors pending re-allocation, so apparently 509 were re-written in place rather than re-allocated, since no additional sectors had been re-allocated.) Every subsequent pass seems to detect a small number of un-readable sectors, and apparently they can always be re-written in place. No additional re-allocated sectors are occurring. I would not trust the drive. (Unless you are using it on a marginal power supply, in which case it might just be reacting to the voltages it is being fed, but regardless... unless something changes, odds are you cannot trust it.) Good luck with RMA'ing it. Sometimes manufacturers like to reject a claim when the drive has been subjected to an over-temperature environment.
  22. Also, you can use hdparm -A /dev/sdg and fdisk -l /dev/sdg to learn more about the specific device. (Odds are VERY HIGH that /dev/sdg is not the disk you think it is.)