BRiT Posted December 29, 2009
You can run multiple instances of preclear_disk.sh on the server at the same time, either from multiple telnet/ssh terminals or from the physical console.
Joe L. Posted December 29, 2009
Quote:
    Quote:
        Post a syslog. It is the only way to see the errors, if there are any. Attach it to your next post.
    Just checked syslog and there's nothing there other than my Telnet logins to the box and the odd spinup/spindown from the kernel due to inactivity on the box. I'm going to try and transfer the SATA interface from the Mobo to a separate SuperMicro 8 SATA card I have in that box and see if that makes any difference. Not seeing any errors, but there is definitely something weird about the speed with this drive. The other drives didn't have anywhere near this sort of problem.
    Myles
There may not be specific errors, but there will be lines showing how the disk itself was initialized by the disk controller. If it was initialized in PIO mode rather than a DMA mode, it will be vastly slower. If you know what to look for, look for all entries specific to that drive. If you are not an expert in syslog analysis, post a full syslog.
Joe L.
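A minimal sketch of the check described above, for readers who want to look before posting a full syslog. It assumes the controller uses the kernel's libata driver, which logs a "configured for ..." line per drive, and that the log lives at /var/log/syslog; the function names are made up for illustration.

```shell
# Print the libata lines that report the negotiated transfer mode,
# e.g. "ata3.00: configured for UDMA/133".
# Usage: transfer_modes /var/log/syslog   (log path is an assumption)
transfer_modes() {
    grep -E 'ata[0-9]+(\.[0-9]+)?: configured for ' "$1"
}

# Print only the links that came up in a PIO mode (the slow case).
# Prints nothing when every drive negotiated a DMA mode.
pio_links() {
    transfer_modes "$1" | grep -E 'configured for PIO'
}
```

If pio_links prints a line, match its ataN number against the other syslog entries for the slow drive. This only narrows things down; a full syslog is still the better thing to post.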
aiden Posted January 16, 2010
Joe, as you know, I've been running 10 preclear cycles on two 2TB Hitachi drives. They are currently at 83% done on the cycle 6 post-read, at 170 hrs. The problem is, my telnet windows seem to have stuck or something, because the progress display is no longer being refreshed. I checked the read/write columns in myMain under unMenu, and the numbers aren't changing. When I touch the drives it feels like they're still spinning, and unMenu seems to confirm this. This is the past few days in the syslog...
Jan 14 02:21:08 Tower kernel: sdb: sdb1
Jan 14 02:21:19 Tower kernel: udev: starting version 130
Jan 14 02:48:36 Tower kernel: sda: sda1
Jan 14 02:48:47 Tower kernel: udev: starting version 130
Jan 15 07:17:50 Tower kernel: sdb: sdb1
Jan 15 07:18:00 Tower kernel: udev: starting version 130
Jan 15 07:49:58 Tower kernel: sda: sda1
Jan 15 07:50:09 Tower kernel: udev: starting version 130
Jan 16 09:15:13 Tower unmenu[1256]: gawk: ./08-unmenu-array_mgmt.awk:115: warning: escape sequence `\'' treated as plain `''
Joe L. Posted January 16, 2010
There is no way to know without seeing a process list:
ps -ef
The script once ran into a bug in the "bash" shell, which could not deal with more than 4096 forks/waits of a sub-shell. Are you using the fixed version? (It has been fixed for a while, but I have no idea how long ago you got your version.) That bug had the same effect: a freeze at a certain point in the clear process, because a child process was never waited for properly.
aiden Posted January 16, 2010
I'm using the latest download from here. The modified date is 10/6/2009 9:15. Process list:
UID        PID  PPID  C STIME TTY   TIME     CMD
root         1     0  0 Jan08 ?     00:00:02 init
root         2     0  0 Jan08 ?     00:00:00 [kthreadd]
root         3     2  0 Jan08 ?     00:00:00 [migration/0]
root         4     2  0 Jan08 ?     00:00:00 [ksoftirqd/0]
root         5     2  0 Jan08 ?     00:00:00 [migration/1]
root         6     2  0 Jan08 ?     00:00:00 [ksoftirqd/1]
root         7     2  0 Jan08 ?     00:00:00 [events/0]
root         8     2  0 Jan08 ?     00:00:00 [events/1]
root         9     2  0 Jan08 ?     00:00:00 [khelper]
root        14     2  0 Jan08 ?     00:00:00 [async/mgr]
root       107     2  0 Jan08 ?     00:00:00 [kblockd/0]
root       108     2  0 Jan08 ?     00:00:00 [kblockd/1]
root       109     2  0 Jan08 ?     00:00:00 [kacpid]
root       110     2  0 Jan08 ?     00:00:00 [kacpi_notify]
root       111     2  0 Jan08 ?     00:00:00 [kacpi_hotplug]
root       187     2  0 Jan08 ?     00:00:00 [ata/0]
root       188     2  0 Jan08 ?     00:00:00 [ata/1]
root       189     2  0 Jan08 ?     00:00:00 [ata_aux]
root       193     2  0 Jan08 ?     00:00:00 [ksuspend_usbd]
root       198     2  0 Jan08 ?     00:00:00 [khubd]
root       201     2  0 Jan08 ?     00:00:00 [kseriod]
root       262     2  0 Jan08 ?     01:42:23 [pdflush]
root       263     2  0 Jan08 ?     01:39:06 [pdflush]
root       264     2  3 Jan08 ?     07:13:20 [kswapd0]
root       307     2  0 Jan08 ?     00:00:00 [aio/0]
root       308     2  0 Jan08 ?     00:00:00 [aio/1]
root       314     2  0 Jan08 ?     00:00:00 [nfsiod]
root       319     2  0 Jan08 ?     00:00:00 [cifsoplockd]
root       547     2  0 Jan08 ?     00:00:00 [usbhid_resumer]
root       553     2  0 Jan08 ?     00:00:00 [rpciod/0]
root       554     2  0 Jan08 ?     00:00:00 [rpciod/1]
root       725     2  0 Jan08 ?     00:00:00 [scsi_eh_0]
root       726     2  0 Jan08 ?     00:00:00 [usb-storage]
root       728     2  0 Jan08 ?     00:00:00 [scsi_eh_1]
root       729     2  0 Jan08 ?     00:00:00 [scsi_eh_2]
root      1051     1  0 Jan08 ?     00:00:00 /usr/sbin/syslogd -m0
root      1055     1  0 Jan08 ?     00:00:00 /usr/sbin/klogd -c 3 -x
root      1094     1  0 Jan08 ?     00:00:00 /usr/sbin/ifplugd -i eth0 -fwI -
bin       1102     1  0 Jan08 ?     00:00:00 /sbin/rpc.portmap
nobody    1106     1  0 Jan08 ?     00:00:00 /sbin/rpc.statd
root      1116     1  0 Jan08 ?     00:00:00 /usr/sbin/inetd
root      1126     1  0 Jan08 ?     00:00:00 /usr/sbin/acpid
root      1133     1  0 Jan08 ?     00:00:00 /usr/sbin/crond -l10
daemon    1135     1  0 Jan08 ?     00:00:00 /usr/sbin/atd -b 15 -l 1
root      1140     1  0 Jan08 ?     00:00:12 /usr/sbin/nmbd -D
root      1142     1  0 Jan08 ?     00:00:00 /usr/sbin/smbd -D
root      1144  1142  0 Jan08 ?     00:00:00 /usr/sbin/smbd -D
root      1150     1  0 Jan08 ?     00:00:00 /usr/local/sbin/emhttp
root      1155     1  0 Jan08 tty1  00:00:00 /sbin/agetty 38400 tty1 linux
root      1156     1  0 Jan08 tty2  00:00:00 /sbin/agetty 38400 tty2 linux
root      1158     1  0 Jan08 tty3  00:00:00 /sbin/agetty 38400 tty3 linux
root      1160     1  0 Jan08 tty4  00:00:00 /sbin/agetty 38400 tty4 linux
root      1167     2  0 Jan08 ?     00:00:00 [mdrecoveryd]
root      1174     1  0 Jan08 tty5  00:00:00 /sbin/agetty 38400 tty5 linux
root      1176     1  0 Jan08 tty6  00:00:00 /sbin/agetty 38400 tty6 linux
root      1235     1  0 Jan08 ?     00:00:00 /usr/sbin/ntpd -g -p /var/run/nt
root      1255     1  0 Jan08 ?     00:00:00 /bin/bash ./uu
root      1256     1  0 Jan08 ?     00:00:00 logger -tunmenu -plocal7.info -i
root      1257  1255  0 Jan08 ?     00:00:01 awk -W re-interval -f ./unmenu.a
root      6576     1  0 Jan15 ?     00:00:00 udevd --daemon
root     25154  1116  0 10:35 ?     00:00:00 in.telnetd: 192.168.10.199
root     25155 25154  0 10:35 pts/1 00:00:00 -bash
root     25166 25155  0 10:35 pts/1 00:00:00 ps -ef
Joe L. Posted January 16, 2010
I don't see it running at all in the process list.
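Rather than reading the whole process list by eye, the same question can be asked with a filter. This is only a sketch: the function name is made up, and it assumes the script was started under a name containing "preclear_disk".

```shell
# Read `ps -ef` output on stdin and print only the rows that mention the
# preclear script. The [p] bracket keeps grep from matching its own
# command line when the filter is fed by a live `ps -ef` pipe.
# Usage: ps -ef | preclear_rows
preclear_rows() {
    grep '[p]reclear_disk'
}
```

No output means no preclear process is alive, which is the situation in the list above.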
aiden Posted January 16, 2010
Me neither. It completely stopped running, on both drives, simultaneously. Is there another log that I can look at to see what happened? Should I just restart with the remaining number of cycles?
Joe L. Posted January 16, 2010
You can look in your syslog to see if the kernel killed the processes for any reason (for example, it needed the memory and the "bash" shell was using it all). The starting and ending "smart" reports are in the /tmp directory, named after their process IDs. The end report from the process is just a "diff /tmp/smart_startNNNN /tmp/smart_finishNNNN" of the two files. I'd just start it again for the remaining cycles.
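Both checks can be scripted. The sketch below assumes the kernel's out-of-memory messages contain one of a few common phrases (the exact wording varies by kernel version) and that the SMART reports follow the smart_startNNNN / smart_finishNNNN naming given above; the function names are hypothetical.

```shell
# Print syslog lines suggesting the kernel's OOM killer ended a process.
# The phrases matched here are common ones, not an exhaustive list.
# Usage: oom_kills /var/log/syslog   (log path is an assumption)
oom_kills() {
    grep -iE 'out of memory|oom-killer|killed process' "$1"
}

# For every smart_startNNNN file in a directory (default /tmp), diff it
# against the matching smart_finishNNNN file if one exists.
diff_smart_reports() {
    report_dir=${1:-/tmp}
    for start in "$report_dir"/smart_start*; do
        [ -e "$start" ] || continue            # no reports at all
        id=${start##*/smart_start}             # the NNNN suffix
        finish="$report_dir/smart_finish$id"
        if [ -e "$finish" ]; then
            echo "== report $id =="
            diff "$start" "$finish" || true    # diff exits 1 when files differ
        fi
    done
}
```

A start file with no matching finish file is itself a hint that the cycle never reached its final SMART report.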
aiden Posted January 16, 2010
Syslog didn't reveal anything. I'll just start the final 4 cycles. Thanks.
Joe L. Posted January 16, 2010
The only other thing I can think of is that you logged off the terminal running the pre-clear sessions. In that case they would terminate themselves.
Joe L.
purko Posted January 16, 2010
Aiden, are you running preclear_disk.sh from the console or from telnet? If it's telnet, have you considered running it inside screen?
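For anyone who has not used screen, a rough sketch of the workflow follows. It assumes screen is installed on the server (it may need to be added separately) and that the script sits in /boot; the session naming scheme and function names are invented for this example.

```shell
# Build a screen session name from a device path,
# e.g. /dev/sdb -> preclear_sdb.
session_name() {
    printf 'preclear_%s\n' "${1##*/}"
}

# Open a named, attached screen session for one drive. Inside it, start
# the script by hand (e.g. /boot/preclear_disk.sh /dev/sdX), then detach
# with Ctrl-A followed by D. The session keeps running on the server even
# if the telnet connection drops.
open_session() {
    screen -S "$(session_name "$1")"
}

# Reattach later from any new telnet or console login.
resume_session() {
    screen -r "$(session_name "$1")"
}
```

With two drives, open_session /dev/sda and open_session /dev/sdb give two independent sessions that can be checked on one at a time.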
aiden Posted January 16, 2010
I'm running the cycles via 2 telnet sessions from a laptop that I leave on 24/7. I never logged off the server and never shut it down, and the telnet windows have been moved, minimized, maximized, etc. without any issues thus far. It's not a big issue to me; I can easily restart the remaining cycles. I was more curious if it was something more fundamental, like the system just got bored.
Joe L. Posted January 16, 2010
Quote:
    I was more curious if it was something more fundamental, like the system just got bored.
Probably not bored... perhaps tired, and maybe eager to start using the new disks... but not bored... In the interim, at least you will have given them a good initial burn-in.
aiden Posted January 16, 2010
Quote:
    In the interim, at least you will have given them a good initial burn in.
Do you think it's enough burn-in? 5 cycles and 170 hours?
prostuff1 Posted January 16, 2010
Quote:
    Do you think it's enough burn in? 5 cycles and 170 hours?
Yes. I think most people here only do a single cycle, which can take almost a day with the newer drives. I usually run 3 cycles on mine. I run the first and get the diff, then start another pass to see if anything has changed, then do one more cycle to make sure that nothing is going to change.
garycase Posted February 20, 2010
Okay, this looks like a nifty utility, but I want to be CERTAIN I'm doing this right so I don't destroy data already on the array. (If I understand it correctly, the script won't let me do that -- but just to be sure ...) So to use it, I do this:
(1) Copy the script to the Flash drive (from Windows Explorer)
(2) Run a Telnet client and type "o tower", then "root" to get a prompt from the UnRAID box
(3) cd /boot to get to the flash drive
(4) preclear_disk.sh /dev/sdX to start the preclear process
Assuming that's all correct, I have a few questions ...
(a) How do I determine what "X" is for step 4? Is there a Linux command that will list my drives with their serial numbers?
(b) Do I have to leave the Telnet window open for the entire PreClear process?
(c) Is there any analog to this process to test drives already in the array? In particular, what does the "Long SMART Test" do in UnMenu on the Disk Management section?
I've installed UnMenu to look around, but am not sure I understand all of the various options. Is there a good "manual" to read through to help with this?
purko Posted February 20, 2010
Quote:
    (a) How do I determine what "X" is for line 4? Is there a Linux command that will list my drives with their serial numbers?
ls -la /dev/disk/by-id/
Quote:
    (b) Do I have to leave the Telnet window open for the entire PreClear process?
Yes. If you want to be able to disconnect from the session, then run the preclear script in "screen". See this: http://lime-technology.com/forum/index.php?topic=2817.msg24827#msg24827
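The ls output lists partitions as well as whole disks, which can be noisy. As a rough sketch, the helper below prints one line per whole disk, pairing the by-id name (which normally embeds model and serial number) with the device node it points to. The function name is made up, and the "-partN" suffix filter assumes the usual udev naming.

```shell
# Print "<by-id name> -> <device node>" for whole disks only.
# Usage: list_whole_disks            (defaults to /dev/disk/by-id)
list_whole_disks() {
    id_dir=${1:-/dev/disk/by-id}
    for link in "$id_dir"/*; do
        [ -L "$link" ] || continue              # only symlinks are of interest
        case ${link##*/} in
            *-part[0-9]*) continue ;;           # skip partition entries
        esac
        printf '%s -> %s\n' "${link##*/}" "$(readlink -f "$link")"
    done
}
```

Check the serial number printed on the drive label against this list before passing any /dev/sdX to the preclear script.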
garycase Posted February 21, 2010
Thanks -- although I realized after I'd asked the question that I can simply look in UnMenu on the Disk Management page and see the Linux designations for each of my disks.
Re my question (c): are the tests shown in UnMenu (i.e. the Short & Long SMART tests) "data safe"? That is, can they be run without any concern for the data on the array? And do they impact the availability of the array (i.e. can you be streaming a movie while the test is in progress)?
Joe L. Posted February 21, 2010
Quote:
    Thanks -- although I realized after I'd asked the question that I can simply look in UnMenu on the Disk Management page and see the Linux designations for each of my disks
    r.e. my question #3 ==> are the tests shown in UnMenu (i.e. the Short & Long SMART tests) "data safe" ??
Yes. They are read-only tests of the drives.
Quote:
    i.e. can they be run without any concern for the data on the array?
Yes, they can be run at any time.
Quote:
    And do they impact the availability of the array (i.e. can you be streaming a movie while the test is in progress) ??
You can watch a movie at the same time. You should disable the spin-down timer, since if unRAID forces a drive to spin down it will abort a long test. You will probably want to use the newest version of the Disk Management plug-in; I think 1.4 is the newest. They are attached here: http://lime-technology.com/forum/index.php?topic=4993.msg46057#msg46057
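The same tests can be started from a telnet prompt with smartctl, which is useful for confirming whether a long test finished or was cut short. This is a sketch, not what the unMenu plug-in necessarily runs: the function names are invented, and the status strings matched are the ones smartctl commonly prints for ATA drives.

```shell
# Start a SMART self-test. $1 is "short" or "long", $2 is the device,
# e.g. start_selftest long /dev/sdX
start_selftest() {
    smartctl -t "$1" "$2"
}

# Read `smartctl -l selftest /dev/sdX` output on stdin and classify the
# most recent entry (the row numbered "# 1").
# Usage: smartctl -l selftest /dev/sdX | latest_selftest_status
latest_selftest_status() {
    awk '/^# *1 / {
        if ($0 ~ /Completed without error/) print "passed"
        else if ($0 ~ /Aborted by host/)    print "aborted"
        else if ($0 ~ /in progress/)        print "running"
        else                                print "check log"
        exit
    }'
}
```

An "aborted" result on a long test is what one would typically expect to see if the drive was spun down partway through, which is why disabling the spin-down timer first is suggested above.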
garycase Posted February 26, 2010
I've experimented with a few disks and have noted that the post-read takes appreciably longer than the pre-read. Is this normal? For example, with a 500GB disk it took ~ 2 hrs for the pre-read; 1.5 hrs for zeroing; and over 3 hrs for the post-read. The SMART data was fine, and no errors were reported. I had similar results on an 80GB drive I tried (not going to actually use that -- but wanted something to compare with).
Joe L. Posted February 26, 2010
Yes, it is expected, since during the post-read we are verifying that the bytes read back are all zero. That verification step is not needed (or performed) on the pre-read, since the contents could be anything. The verification step was added after one user found a drive that, when read, occasionally returned "1" bits where zeros had been written. It did not happen frequently, but it was enough to drive him crazy with parity errors: each time a check was done, an occasional bit here and there would not be correct, and the check would "fix" parity to match. Of course, it was actually reading back bad data and making parity match the bad data.
Joe L.
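To illustrate the kind of check the post-read adds, here is a small sketch that compares the start of a file or device against /dev/zero. It is not taken from preclear_disk.sh, which may verify the data differently; the function name is made up, and the -n option assumes a cmp that supports it (GNU and BSD cmp do).

```shell
# Succeed (exit 0) only if the first N bytes of a file or device are all
# zero. $1 is the path, $2 is the byte count. Fails if any byte is
# non-zero or if the input is shorter than N bytes.
# Usage: first_bytes_are_zero /dev/sdX 1048576
first_bytes_are_zero() {
    cmp -s -n "$2" "$1" /dev/zero
}
```

The pre-read only has to pull the data off the platters, while a verifying read also has to compare every byte, which is consistent with the longer post-read times reported above.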
samukas Posted March 6, 2010
Hey everyone! At the moment I have 4 disks in my array, and last week I added 4 more (Samsung green 1.5TB disks) to grow the array. After the disks were installed, I ran preclear_disk on all of them simultaneously and got errors on every disk. I thought that was very strange, so I re-ran pre-clear on the 1st disk to see what would come up. I took 2 pictures of the errors I got on the 2nd run. (I don't have any of the first run, sorry, but there were more errors. Does preclear keep a log file somewhere?) Are these errors that I should be concerned about, or can I add the drives to the array? What I thought was strange was that each and every one of the new disks had errors... This didn't happen to me with the first 4 disks (those are WD Green).
prostuff1 Posted March 6, 2010
No, those look fine. The reports you need to be concerned about are the ones that come back with a lot of reallocated sectors and current pending sectors.
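Those two counters can be pulled out of the SMART attribute table directly. The sketch below assumes the usual ten-column layout of `smartctl -A` output for ATA drives, where the attribute name is column 2 and the raw value is column 10; the function name is invented.

```shell
# Read `smartctl -A /dev/sdX` output on stdin and print the raw values of
# the two attributes mentioned above, one "name value" pair per line.
# Usage: smartctl -A /dev/sdX | sector_counters
sector_counters() {
    awk '$2 == "Reallocated_Sector_Ct" || $2 == "Current_Pending_Sector" {
        print $2, $10
    }'
}
```

Running it before and after a preclear cycle and comparing the two outputs shows whether either count is growing.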
samukas Posted March 11, 2010
OK, so I did what you suggested, prostuff1. I had already re-run preclear on the drive I took the screenshots of, so I did it again with the other 3 drives. 2 of them also gave those errors, but I just ignored them. The last one, however, isn't "pre-clearing"... I've already repeated the process 3 times, but it just gets to 11% of the 1st step and then doesn't do anything else. The elapsed time increases, but it goes on and on without reading the drive... Any ideas?