
Joe L.

Everything posted by Joe L.

  1. Actually, no. But cumulatively on one boot, yes. I pre-cleared two 1.5T, then a 500G. For some reason it didn't show problems until the end of the 500G drive. I have 4G RAM installed (2.9G available to Linux), but I failed to notice how much was freed each time. Gotcha. Ok, thanks. I'll try that. --Bill Actually, yes... Your 4Gig of RAM (2.9Gig available for programs/buffers) is most certainly less than even the smallest of your disk drives (500Gig). Your buffer cache used for disk I/O will fill at a rate of about 75-80 MB/second (however fast you are reading and/or writing to the disk). It is only freed when another program needs the memory.
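     To watch this for yourself, a minimal sketch (assuming the classic procps "free" layout, where the last column of the Mem: line is the cache size):

     # Print memory use once a minute while a preclear runs; the "cached"
     # figure climbs toward total RAM and only shrinks when something
     # else needs the memory.
     while true; do
         free -m | awk 'NR==2 {printf "used: %5d MB   cached: %5d MB\n", $3, $NF}'
         sleep 60
     done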
  2. Until the array is started, yes. There's no indication of what is new and cleared. But as soon as you start the array, the drives that were pre-cleared and newly installed show up as needing formatting. At least that's how it was on the last two I just did with the 9.6 version of preclear_disk.sh. Why would it matter if there was a parity drive or not? It would not matter if there was a parity drive or not. Don't think I understand this. If the drive had something extra written to, say, sector 2, and that was consistent throughout the life of the drive in that array, the parity check would always be consistent, wouldn't it? Why would you not be able to restore? Basically, it would... but not for the reason you think. Sectors 0 through 62 are unused, not part of any partition, and NOT part of the protected data. The protected data in unRAID does not start until the first partition. (You can write anything you want to sector 2, but it will be gone when you restore.) The good news, however, is that the first 64k of a reiserfs partition is unused by reiserfs, and reserved specifically for boot records, etc. So... you can write to the first sector of /dev/md1 (which is actually the first sector of the first partition) and you will then get it recorded on the disk, and protected by parity. Never write directly to /dev/sdX1, as it will cause a parity error. Oh yes. I can't tell you the number of times I've had to enter the old drive parms into a BIOS manually, in the (not so) 'good old days'. But we were in heaven then with the new 386-25 or 33MHz motherboard and our 50-100MB full size drives... Screamers, you know. --Bill Oh yes... I go back a bit more than that... Do you remember 8" floppy disks? (or loading programs via punched paper tape on a teletype machine?) Talk about slow...
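     As an illustration of that point only (a sketch, with /dev/md1 standing in for the md device of the disk in question; never use the raw /dev/sdX1 device for this):

     # Write a short "note" into the first sector of the first partition
     # via the md device, so it lands on the disk AND is covered by parity.
     echo "drive installed 2009-08-17" | dd of=/dev/md1 bs=512 count=1 conv=sync
     # Read it back later:
     dd if=/dev/md1 bs=512 count=1 2>/dev/null | strings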
  3. It is using the cache, and linux will let it use as much as it can. It is not specifically "dd", but the fact that you are reading/writing every block of the disk being cleared, and each block is being held in cache in case you will be referencing it again shortly... It has no way to know the usage patterns of an unRAID server. It just frees the least recently used buffer memory when it is needed and not currently being referenced. (And odds are pretty high that your disk size is greater than the amount of memory you have in the server, no matter how much RAM you have.) You might set the cache pressure to allow it to re-use the buffer cache more quickly rather than to keep entries in the cache:

     sysctl vm.vfs_cache_pressure=100
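     To check the value first and have the change re-applied at boot, something like this should work (assuming the usual unRAID location of the "go" script on the flash drive):

     sysctl vm.vfs_cache_pressure                 # show the current setting
     sysctl vm.vfs_cache_pressure=100             # apply it now; higher values reclaim cache sooner
     echo "sysctl vm.vfs_cache_pressure=100" >> /boot/config/go   # re-apply at every boot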
  4. Hmmm. It seems obvious that I must have done something wrong in the sequence, but I'm quite sure I didn't format any drives. In fact I recall being surprised that the supposedly pre-cleared ones didn't show up needing to be formatted, since with all zeros just written there can't be a filesystem structure there. Do the disks show as available in the management interface? (with a size, and free space?) Not that I'm aware of, but then most people have a parity drive assigned... I don't know what happens if a pre-cleared set of disks is present with no parity drive assigned. I would have expected the "Format" button, but who knows. The "Format" button would format all disks that were assigned and unable to be mounted... It is not necessary to do it for each disk in turn. No, all of the bytes in the MBR (the first 512) are significant, and are tested by unRAID. Interestingly, the remainder of the first cylinder (up to sector 62) is currently not used at all... for historical reasons that allow the disk to be recognized as partitioned in older BIOS and in Windows. So, you could write something there... as a "note." I'd just write something down on paper... If you accidentally made a drive look like it was pre-cleared, and it had anything other than zeros on the remainder of the disk, you would completely break all the parity calculations and be unable to restore from a disk failure unless you did a full parity check and fixed all the parity ... prior to the failure ... It has been a long time since the old C/H/S geometry worked as originally defined... In fact, it could not handle disks > 8 Gig. (At that time, 8 Gig was a dream... disk sizes were measured in Megabytes, not Terabytes, and a (tiny by today's standards) 20 Megabyte drive was nearly $1000. See here: http://www.mattscomputertrends.com/harddiskdata.html) Joe L.
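     That 8 Gig ceiling falls straight out of the sizes of the C/H/S fields, which max out at 1024 cylinders, 255 heads, and 63 sectors per track:

     # 1024 cylinders x 255 heads x 63 sectors/track x 512 bytes/sector:
     echo $(( 1024 * 255 * 63 * 512 ))    # 8422686720 bytes, just under 8 GiB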
  5. Nice looking rig. Be aware, even though the hardware is hot-swap, unRAID is not hot-swap. At least one user suffered a series of failures when hot-plugging an additional drive into an array caused it to re-assign drive IDs. He attempted to recover on his own, and ended up making an error that caused a loss of data on an entire drive. Always power down when plugging and un-plugging drives.
  6. It certainly sounds as if you've already created file-systems on the disks. If you have them assigned to your array, can you store files on them? (They would have shown as "Un-formatted" if unable to be mounted, and if they are mounted, they have a file-system.) I somehow think you pressed the "Format" button... at least once. I have never heard of unRAID doing it on its own, but then again, most of the time it is not presented with a pre-cleared disk before parity is established. Edit: I've since figured out that the act of assigning the drives to the array will write the MBR's partition table in preparation for the creation of a file-system on it. As far as the "warning" from fdisk about the partition not ending on a cylinder boundary: yes, very normal. The Cylinder/Heads/Sectors notation is not used by modern disks. It is "faked" and reported so very-old BIOS can still interact with the disk. See here: http://www.pcguide.com/ref/hdd/geom/geomPhysical-c.html and here: http://www.pcguide.com/ref/hdd/geom/geomLogical-c.html Modern disks use a linear addressing scheme... Their bit density on the platters guarantees more than 63 sectors per track, but that is as many as the field in the MBR would allow... The number of heads can be as high as 255, but again, no modern disk has that many... typically they have 2 per platter, one on top and one on the bottom. Many disks will therefore report 1 head, 63 sectors per track, and a much higher number of "cylinders" than actually present. The number of "sectors" per track is not constant on modern disks... There are more on the outer tracks than on the inner ones to keep the bit density equal. (Otherwise, the bits on the inner tracks would be much closer together, and harder to read accurately.) Your disk "reported" 46,512,336 cylinders. If true, the arm holding the disk head would have to be able to be positioned to 46 million different positions... accurately... It reported 1 head... difficult to read the top AND bottom of the platter with only 1 head. In fact, most 1.5TB drives have either 3 or 4 platters... Impossible to read 6 (or 8) surfaces with 1 disk head. Don't worry about the "partition not ending on cylinder boundary" warning... it has no meaning on current drives. Joe L.
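     You can see for yourself that the reported geometry is just back-computed from the linear size (a sketch; /dev/sdX is a placeholder for your drive):

     blockdev --getsz /dev/sdX     # true size in 512-byte sectors, as the kernel sees it
     fdisk -l /dev/sdX | head -2   # the "faked" geometry: 1 heads, 63 sectors/track, N cylinders
     # 1 head x 63 sectors/track x 46512336 "cylinders" = 2930277168 sectors,
     # and 2930277168 x 512 bytes is exactly the 1,500,301,910,016 bytes reported.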
  7. Thanks Joe, here it is:

     Disk /dev/sde: 1500.3 GB, 1500301910016 bytes
     1 heads, 63 sectors/track, 46512336 cylinders
     Units = cylinders of 63 * 512 = 32256 bytes
     Disk identifier: 0x00000000

        Device Boot      Start         End      Blocks   Id  System
     /dev/sde1               2    46512336  1465138552+  83  Linux
     Partition 1 does not end on cylinder boundary.

     0000000 0000 0000 0000 0000 0000 0000 0000 0000
     *
     0000448 0000 0083 0000 003f 0000 7af1 aea8 0000
     0000464 0000 0000 0000 0000 0000 0000 0000 0000
     *
     0000496 0000 0000 0000 0000 0000 0000 0000 aa55
     0000512

     Disk /dev/sdg: 1500.3 GB, 1500301910016 bytes
     1 heads, 63 sectors/track, 46512336 cylinders
     Units = cylinders of 63 * 512 = 32256 bytes
     Disk identifier: 0x00000000

        Device Boot      Start         End      Blocks   Id  System
     /dev/sdg1               2    46512336  1465138552+  83  Linux
     Partition 1 does not end on cylinder boundary.

     0000000 0000 0000 0000 0000 0000 0000 0000 0000
     *
     0000448 0000 0083 0000 003f 0000 7af1 aea8 0000
     0000464 0000 0000 0000 0000 0000 0000 0000 0000
     *
     0000496 0000 0000 0000 0000 0000 0000 0000 aa55
     0000512

     Disk /dev/sdb: 750.1 GB, 750156374016 bytes
     1 heads, 63 sectors/track, 23256336 cylinders
     Units = cylinders of 63 * 512 = 32256 bytes
     Disk identifier: 0x00000000

        Device Boot      Start         End      Blocks   Id  System
     /dev/sdb1               2    23256336   732574552+  83  Linux
     Partition 1 does not end on cylinder boundary.

     0000000 0000 0000 0000 0000 0000 0000 0000 0000
     *
     0000448 0000 0083 0000 003f 0000 66b1 5754 0000
     0000464 0000 0000 0000 0000 0000 0000 0000 0000
     *
     0000496 0000 0000 0000 0000 0000 0000 0000 aa55
     0000512

     --Bill

     I'm not sure how you managed it, but all three have the file-system type set to "83" (linux) at offset 450. This is not the normal value for a pre-cleared disk, but is what you would expect to find after a linux file system was created on the first partition. With that value unRAID would be very likely to assume a reiserfs exists and not do anything to the drive, as the remaining bytes all seem to match what would exist AFTER a formatting step had occurred and a reiserfs file-system created on the drives. What did you do to the disks after you pre-cleared them? Did you format them? Did you attempt to mount them? Did you use "fdisk" on them? Something put the "83" file system type byte in there for the first partition... and it was not the pre-clear script. Joe L.
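     A quick way to inspect just that byte (a sketch; /dev/sdX is a placeholder):

     # Offset 450 (0x1C2) of the MBR holds the type byte of the first
     # partition entry: 83 = Linux, 00 = what a pre-cleared disk shows.
     dd if=/dev/sdX bs=1 skip=450 count=1 2>/dev/null | od -t x1 -A n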
  8. To analyze what the script did, or did not do, please supply the output of the following two commands for each of the drives it says are not pre-cleared:

     fdisk -l /dev/???
     dd if=/dev/??? count=1 | od -x -A d

     (substitute ??? for your correct drive designation) Joe L.
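     If several drives are involved, a small loop saves typing (the device names here are placeholders):

     for d in /dev/sdb /dev/sde /dev/sdg; do
         echo "===== $d ====="
         fdisk -l $d
         dd if=$d count=1 2>/dev/null | od -x -A d
     done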
  9. The new version of preclear_disk.sh does log the difference to the syslog. There was no good reason not to; I just did not think of it until you mentioned it. Thanks for the suggestion. The new 0.9.6 version has a few other improvements in reporting via e-mail (if you have a mail program configured on your server) and also reports drive temperature as the process is under way (in case the drive is overheating from lack of cooling air movement). The new version is attached to the first post of the original thread here. Joe L.
  10. As promised, there is now an updated version of the preclear_disk.sh script attached to the first post in this thread. It is now version 0.9.6: 5th edit: August 31, 2009. New version 0.9.6 (0.9.4 and 0.9.5 were internal versions as jbuszkie and I tested)

     # Version .9.4 - Enable SMART monitoring, just in case it is disabled on a drive.
     # Version .9.5 - Added disk temperature display and disk read speed display.
     # Version .9.6 - Enhanced the mail reporting to include some statistics (time, MB/s, temp, etc.)
     #              - Fixed a bug with using zero.txt and concurrent tests. Each test will use its own file.
     #              - Changed read block size to be bigger than 1,000,000 if smaller, to improve read speed

     There is nothing wrong with the previous version, but the newer one provides more statistics as it runs, especially if you have configured mail on your server and you use the -m mailaddress@mailhost option to mail progress reports to yourself. This time, a lot of the improvements are credited to jbuszkie. He took my original script and made it even more informative. I just prettied things up once he made his improvements and sent them onward to me for review. Oh yes, the "diff" of the smart reports is now logged to the syslog... Makes it easier to see the changes to the results all in one spot. As always, if you spot anything we missed, or have a suggestion for an improvement, let me know. Joe L.
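     A typical invocation might look something like this (the script path and device name are placeholders; check the first post for the exact options your copy supports):

     /boot/preclear_disk.sh -m admin@example.com /dev/sdX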
  11. Not normally, at least not before /etc/profile is run, and usually never for the "root" user to prevent abuse (although Tom has edited /etc/profile to add it, even for root). Since no login has occurred when the "go" script is being invoked, it is not using /etc/profile at all.
  12. When the "go" script is being run, the current working directory is probably "/" and the file you are trying to invoke is not in the search path. Try using either a full path to the script:

     nohup /boot/whatever/s3.sh &

     or change directory to it first, and then use "nohup ./s3.sh":

     cd /boot/whatever
     nohup ./s3.sh &
  13. Neither block size makes the test any better, or worse... Certainly there will be more thrashing with the smaller block size. You are executing MANY more individual read commands (255 times as many), so you are moving the disk head from first to last 255 times as much. The notion of cylinders, heads, and sectors is a carry-over from the early days of disk drives... pre-dating even MS-DOS. Today's disks will have one disk head per platter surface. They will very likely have more than 63 sectors per track. The problem is the MBR record was designed with small fields that can't hold true values, so they are "faked" by the drive. The actual addressing of the disk is hidden from us entirely and is based on sector number. Trust me, you did not magically go from 1 disk head to 255 by clearing it. Your observation is interesting though, and it suggests we might add code to multiply the "Unit" by some value if it is under 1M, just to make the efficiency a bit better. I think I still want the math to work out so we do not have a partial "read" at the end of the disk. Something like this might work:

     tu=$units
     while [ "$units" -lt 1000000 ]
     do
         units=$((units + tu))
     done

     Joe L.
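     With the 32,256-byte "cylinder" unit from the fdisk output earlier in the thread, the loop settles on the smallest multiple above 1,000,000, so the final read still ends exactly on a unit boundary:

     units=32256; tu=$units
     while [ "$units" -lt 1000000 ]; do units=$((units + tu)); done
     echo $units    # 1032192, i.e. 32 x 32256, just over the 1MB floor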
  14. [shame]Oops... you are correct... a bug...[/shame] I'll get over it though... It will be fixed in the next release of the preclear_disk.sh script. (due out any day now) It affected the progress display only... Joe L.
  15. It sounds as if your motherboard BIOS, (or disk controller) does not think there is any ability to use the hardware write-cache, but you know better. I'd just add the line to the end of the "go" script, exactly as you said. Glad you figured out how to enable the write-cache on your hardware. Also glad you got rid of the squeaky disk... It can only indicate friction as parts rub on each other... and that is never good for a hard-disk. Joe L.
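     The thread doesn't show the exact line, so purely as an assumption, the usual way to force the write cache on at every boot would be an hdparm call appended to the "go" script (the device name here is hypothetical):

     # Turn the drive's write cache on at boot (hypothetical device name):
     echo "hdparm -W 1 /dev/sda" >> /boot/config/go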
  16. Look here for how to interpret your results: http://en.wikipedia.org/wiki/Self-Monitoring,_Analysis,_and_Reporting_Technology#Known_ATA_S.M.A.R.T._attributes As already described, your drive looks good. It does not show uncorrectable errors. But now you can verify for yourself.
  17. I still don't like what the disk is doing. Far too many uncorrectable errors, and they seem to be continuing. From what it appears, the pending-reallocation sectors were successfully written in their original locations and not re-allocated... interesting. Joe L.
  18. It is far more likely to be limited by the disk controller bandwidth, not the cpu or memory speeds. But, give it a try and let us know.
  19. I see an RMA in your future. Not sure why the 331 sectors were not re-allocated already, unless the failures were in the post-read, and no subsequent "write" has happened since then to those sectors. Joe L.
  20. I started the 1.5TB preclear process on Aug 16 at 14:52:19. It ended Aug 17 at 08:13:51. So, it looks like about 17.4 hours. Your zeroing (writing) time is consistent with what I was saying... It is done linearly... so the disk does not have to move the disk heads very far or often compared to the read phases of pre-clear. Joe L.
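     The elapsed time works out with a little date arithmetic (GNU date syntax):

     start=$(date -d "Aug 16 14:52:19" +%s)
     end=$(date -d "Aug 17 08:13:51" +%s)
     echo "$(( (end - start) / 3600 ))h $(( ((end - start) % 3600) / 60 ))m"   # 17h 21m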
  21. Wow... very nice speeds. Now I understand your comment. There is one HUGE difference... Linear addressing and read-ahead buffering vs. random blocks of data and forcing tons of head seeking. The pre/post read process is specifically designed to exercise the disk in ways that uncover problems... before the disk ends up in the protected array. For each block of data read from the disk linearly, it also reads three random blocks of data from somewhere else on the disk, and also reads the very first block on the disk, and the very last. Those last two are read bypassing the disk buffer cache, so the disk head must make a sweep across the disk with the linear block in between somewhere. For disk parity, and parity checks, the disk head barely has to move, and when it does, it moves just one track at a time... I think that is the reason for the huge difference in speed. As I said, a normal "dd" command is a linear read of all the blocks in turn. You might find the "writing" phase of the preclear_disk script faster than the read phases... as it is a linear write to all the blocks... For it, the track-to-track seek time has far less effect. Joe L.
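     A rough sketch of that access pattern (not the script's actual code; /dev/sdX is a placeholder, and iflag=direct is the usual way to bypass the buffer cache):

     dev=/dev/sdX
     last=$(( $(blockdev --getsz $dev) - 1 ))    # final 512-byte sector
     for blk in $(seq 0 2048 $last); do
         dd if=$dev bs=512 skip=$blk count=2048 of=/dev/null 2>/dev/null          # next linear block
         for i in 1 2 3; do                                                       # three random blocks
             dd if=$dev bs=512 skip=$(( (RANDOM * RANDOM) % last )) count=1 of=/dev/null 2>/dev/null
         done
         dd if=$dev bs=512 skip=0 count=1 iflag=direct of=/dev/null 2>/dev/null       # first sector, uncached
         dd if=$dev bs=512 skip=$last count=1 iflag=direct of=/dev/null 2>/dev/null   # last sector, uncached
     done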
  22. After 14+ hours your disk temperature went from 25C to 28C. I'd say that in itself is not too serious, but it does say you have some serious fans. The S.M.A.R.T. wiki here indicates that attribute 200 is the write error rate (Multi-Zone Error Rate). You started with the default initialized value of 253, and after a full 14+ hour pre-clear cycle, it has a normalized value of 100. The failure threshold is 0. You are nowhere close to the failure threshold value, so unless it changes over time, you are fine there too. Joe L.
  23. Joe, I just got a new unRAID MB and CPU and I'm currently testing it with two new Samsung 1.5T drives. I'm preclearing both and I'm not getting anywhere near the speeds you are. If yours was a PCI based system... mine is a new PCIe based system. I only have the two drives attached. Syslog says they are running at 3.0Gb/s... But they are both going at a rate of about 25% every 4 hours for the preread. Even when I just did one drive I was getting 2GB/min ~ 34MB/s. I would expect a lot better than that! Right now I'm getting about 25.6MB/s. Am I missing something? In the log I see:

     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: smartctl version 5.38 [i486-slackware-linux-gnu] Copyright (C) 2002-8 Bruce Allen
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: Home page is http://smartmontools.sourceforge.net/
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]:
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: === START OF INFORMATION SECTION ===
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: Device Model: SAMSUNG HD154UI
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: Serial Number: S1Y6J1KS743788
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: Firmware Version: 1AG01118
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: User Capacity: 1,500,301,910,016 bytes
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: Device is: In smartctl database [for details use: -P show]
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: ATA Version is: 8
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: ATA Standard is: ATA-8-ACS revision 3b
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: Local Time is: Fri Aug 21 23:17:43 2009 EDT
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]:
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: ==> WARNING: May need -F samsung or -F samsung2 enabled; see manual for details.
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]:
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: SMART support is: Available - device has SMART capability.
     Aug 21 23:17:43 Tower2 preclear_disk-start[14626]: SMART support is: Enabled

     What is this -F samsung? I'm running in AHCI mode set in the BIOS. Anything else I'm missing? Why would you expect your system to be faster? A PCI bus can do about 133 MB/s. When reading a single drive, it is easily able to keep up. My drive was a 7200 RPM drive; I think yours is a 5400 RPM drive (correct me if I'm wrong), so I'd expect it to be a bit slower. Odds are good you are doing just fine. With 2 1.5TB drives, I think you are a bit slower, but then I can't tell from here if anything else looks interesting in your syslog (because you did not post it). Are you also copying files to the server at the same time, or doing a parity check or initial parity calculation? As far as the -F samsung... did you try looking in the smartctl manual page as instructed? If your smartctl report looks otherwise normal, odds are you do not need the -F option. It certainly is not needed for the preclear script, as it just looks for differences, not absolute values.
  24. As far as the hot-plug causing harm... Look through this thread. After a hot plug, and a reboot when it did not work as expected, the user accidentally started a "parity check" with a drive that was not mounted. It ran for a minute or two before he stopped it. It read "zeros" from the un-mounted drive and changed parity accordingly... Later, when a replacement drive was installed, those zeros were written to it instead of the normal file-system structures. Basically, he had wiped his data, from both parity and the drive. That hot-plug initiated actions that resulted in one of the few cases I know of where unRAID lost data. All that said, stop your array, reboot, and you'll probably be fine. Oh yeah... don't hot-plug... always stop the array and power down. Joe L.
  25. It sounds as if the out-of-memory kernel process is killing processes on your server. Deleting the syslog does not free the space it uses if there is still a process that has an open file-descriptor writing to it. The blocks are freed only after there are no more references to it, and an open file-descriptor is a reference. Some programs actually take advantage of this behavior and create a temp file, open it for reading and writing, then delete it. Until the file-descriptors are closed, the temp file is still readable and writable by that program... The memory (and space) is automatically freed when the program exits. To stop the old syslog process, and restart it, type the following:

     /etc/rc.d/rc.syslog restart

     It should free up the memory and you should then see the new syslog file you created start to be used.
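     To confirm that a deleted-but-open file is what is holding the space (assuming lsof is installed on your build):

     lsof | grep deleted    # lists deleted files still held open (and still consuming space)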