Robertxc Posted October 3, 2009
I've done a complete fresh install of Unraid and, having assigned all the drives, I am now doing the parity sync. It has now been syncing for over 24 hours and it says it still has 1810 minutes to go! Surely this is not normal? Also, all the drives except the parity are spun down, and judging by the temperature of the parity drive, it doesn't seem to be doing much either. Any suggestions? Syslog is here: http://pastebin.com/mf286b91
NLS Posted October 3, 2009
Is the "current position" moving? If not, I would restart the whole thing.
Robertxc Posted October 3, 2009 Author Share Posted October 3, 2009 Yes, it is moving...barely. Quote Link to comment
Robertxc Posted October 3, 2009 Author Share Posted October 3, 2009 Would it help if i stopped the array? Would the parity sync continue? Quote Link to comment
purko Posted October 3, 2009
Hi Robertxc! Some things in the syslog look suspicious to me...

  Oct 2 15:19:37 Tower emhttp: shcmd (41): mkdir /mnt/disk2
  Oct 2 15:19:37 Tower emhttp: shcmd (41): mkdir /mnt/disk1
  Oct 2 15:19:37 Tower kernel: mdcmd (7): check
  Oct 2 15:19:37 Tower kernel: md: recovery thread woken up ...
  Oct 2 15:19:37 Tower kernel: md: recovery thread syncing parity disk ...
  Oct 2 15:19:37 Tower emhttp: shcmd (42): mount -t reiserfs -o noacl,nouser_xattr,noatime,nodiratime /dev/md1 /mnt/disk1 >/dev/null 2>&1
  Oct 2 15:19:37 Tower emhttp: shcmd (42): mkdir /mnt/disk3

The above shows that the system started the parity sync before it had even mounted all the disks. I am not sure whether that is normal or not.

Later in the syslog I notice that the system is putting the disks to sleep:

  Oct 3 00:48:09 Tower emhttp: shcmd (63): /usr/sbin/hdparm -y /dev/sda >/dev/null
  Oct 3 00:48:09 Tower emhttp: shcmd (64): /usr/sbin/hdparm -y /dev/sdd >/dev/null
  Oct 3 00:48:09 Tower emhttp: shcmd (65): /usr/sbin/hdparm -y /dev/sdc >/dev/null
  Oct 3 00:48:09 Tower emhttp: shcmd (66): /usr/sbin/hdparm -y /dev/sde >/dev/null
  Oct 3 00:48:09 Tower emhttp: shcmd (67): /usr/sbin/hdparm -y /dev/sdb >/dev/null

Now this does not look right, assuming that the parity sync is running and reading all disks all the time. This is probably the buggy part.

On my system I have set up unRAID NOT to put the disks to sleep. My reasoning is that if a disk can be put to sleep, then it can do so by itself and doesn't need a command telling it when. All it needs is to have its inactivity sleep timer set. So in my 'go' script I have the following line:

  for i in /dev/[sh]d? ; do hdparm -S200 "$i" >/dev/null 2>&1 ; done

Also, I notice that your unRAID is constantly talking to a DHCP server:

  Oct 2 23:17:33 Tower dhcpcd[1403]: DHCP_ACK received from (192.168.0.1)
  Oct 2 23:47:33 Tower dhcpcd[1403]: sending DHCP_REQUEST for 192.168.0.190 to 192.168.0.1
  Oct 2 23:47:34 Tower dhcpcd[1403]: dhcpIPaddrLeaseTime=3600 in DHCP server response.
  Oct 2 23:47:34 Tower dhcpcd[1403]: dhcpT1value is missing in DHCP server response. Assuming 1800 sec
  Oct 2 23:47:34 Tower dhcpcd[1403]: dhcpT2value is missing in DHCP server response. Assuming 3150 sec
  Oct 2 23:47:34 Tower dhcpcd[1403]: DHCP_ACK received from (192.168.0.1)
  Oct 3 00:17:34 Tower dhcpcd[1403]: sending DHCP_REQUEST for 192.168.0.190 to 192.168.0.1
  Oct 3 00:17:35 Tower dhcpcd[1403]: dhcpIPaddrLeaseTime=3600 in DHCP server response.
  Oct 3 00:17:35 Tower dhcpcd[1403]: dhcpT1value is missing in DHCP server response. Assuming 1800 sec
  Oct 3 00:17:35 Tower dhcpcd[1403]: dhcpT2value is missing in DHCP server response. Assuming 3150 sec

Why do you need that? Just set up its IP address manually, and tell it not to use DHCP.

All this said, I may still be missing your real problem. Maybe somebody more knowledgeable can chime in.

Yours, Purko
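For readers wondering what the -S200 in the 'go' script line above amounts to, here is a minimal sketch that decodes an hdparm -S value into seconds, following the encoding described in the hdparm man page. The function name standby_seconds is made up for illustration, and the input is assumed to be an integer from 0 to 255.

```shell
#!/bin/sh
# Hypothetical helper: convert an hdparm -S value (assumed 0..255)
# into a standby timeout in seconds, per the hdparm man page encoding.
standby_seconds() {
    v=$1
    if [ "$v" -eq 0 ]; then
        echo 0                         # 0 disables the standby timer
    elif [ "$v" -le 240 ]; then
        echo $(( v * 5 ))              # 1..240: multiples of 5 seconds
    elif [ "$v" -le 251 ]; then
        echo $(( (v - 240) * 1800 ))   # 241..251: multiples of 30 minutes
    elif [ "$v" -eq 252 ]; then
        echo 1260                      # 252: 21 minutes
    elif [ "$v" -eq 255 ]; then
        echo 1275                      # 255: 21 minutes 15 seconds
    else
        echo "vendor-defined or reserved value" >&2   # 253, 254, out of range
        return 1
    fi
}

standby_seconds 200   # prints 1000, i.e. roughly 16.7 minutes of inactivity
```

So with -S200 each drive would spin itself down after about 16.7 minutes without activity, independent of any hdparm -y commands issued by emhttp.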
Robertxc Posted October 3, 2009 Author Share Posted October 3, 2009 Thanks Purko. I think I might just reboot and start again. There's no particular reason why I set it to get its IP address automatically, I am running a DHCP server anyway, and all the other machines on my network use it. I could just as easily give it a fixed one. I didn't think it would cause any issues for Unraid. Quote Link to comment
prostuff1 Posted October 3, 2009
It usually works better if you set the server to get the same IP address on every boot. I do this by reserving the MAC address of the server in my router and telling it to assign the server a certain IP. In other words, I leave the server on DHCP but tell the router that when it sees a certain MAC, it needs to assign it this IP address.
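As a side note on the dhcpcd lines quoted earlier in the thread: the "Assuming 1800 sec" and "Assuming 3150 sec" values are consistent with the default timers from RFC 2131, where renewal (T1) happens at half the lease time and rebinding (T2) at 0.875 of it. A small sketch of that arithmetic, with a made-up function name dhcp_timers:

```shell
#!/bin/sh
# Hypothetical helper: default DHCP renewal (T1) and rebinding (T2)
# times per RFC 2131: T1 = 0.5 * lease, T2 = 0.875 * lease.
dhcp_timers() {
    lease=$1
    t1=$(( lease / 2 ))        # renewal time
    t2=$(( lease * 7 / 8 ))    # rebinding time (0.875 = 7/8)
    echo "T1=${t1} T2=${t2}"
}

dhcp_timers 3600   # prints T1=1800 T2=3150
```

With the 3600 second lease shown in the syslog, this gives a renewal request every 30 minutes, which matches the timestamps in the log. A longer lease on the DHCP server would space the requests out accordingly.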
Joe L. Posted October 3, 2009
If your disks are different sizes, then the smaller ones will go to sleep once they are no longer being read as part of the parity calculations. What will provide clues for people to give better assistance is for you to post a copy of your syslog... otherwise, we are just guessing. All it would take is one misbehaving disk to get the times you are seeing. And no, with a 2TB set of disks and a PCI-only bus, it could take 24 hours or more.
Joe L.
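To get a feel for the "24 hours or more" figure, here is a rough back-of-the-envelope sketch. It assumes the sync time is simply the parity size divided by a sustained per-disk read rate; the 20 MB/s rate is an illustrative assumption (a shared PCI bus splits its bandwidth across every disk being read at once), not a measured value from this system, and estimate_sync_hours is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: estimate parity sync duration in hours.
# Usage: estimate_sync_hours SIZE_GB RATE_MBS
# SIZE_GB  = size of the parity disk in GB (decimal, 1 GB = 1000 MB)
# RATE_MBS = assumed sustained per-disk rate in MB/s (must be non-zero)
estimate_sync_hours() {
    awk -v gb="$1" -v rate="$2" \
        'BEGIN { printf "%.1f\n", gb * 1000 / rate / 3600 }'
}

estimate_sync_hours 2000 20   # prints 27.8
```

Under those assumptions a 2TB sync lands at a bit over a day. A single slow or misbehaving disk lowers the effective rate for the whole array and stretches the estimate further.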
Robertxc Posted October 3, 2009 Author Share Posted October 3, 2009 IF all your disks are different sizes that the smaller will go to sleep once they are no longer being read as part of the parity calculations. What will provide clues for people to give better assistance is for you to post a copy of your syslog... otherwise, we are just guessing. All it would take is one disk to be mis0behaving to get the times you are seeing. And no, with a 2TB set of disks and a PCI only bus, it could take 24 hours or more. Joe L. Do you mean a syslog in addition to the one I linked to in my first post? I have since restarted the parity sync and (so far) it seems to be behaving normally. The most up to date (since restarting the parity sync) syslog is here: http://pastebin.com/m7dc5a899 Quote Link to comment
RobJ Posted October 4, 2009
Your syslogs look pretty good, until the parity build fails to finish, and that is probably going to remain a mystery. You have restarted, and another parity build has begun, but if it fails too, then I recommend that you obtain a SMART report for your parity drive.

The parity build proceeded without issue through all of the 750GB drives (at the 50% point), then spun them down, then finished the four 1TB drives (at the 67% point) and spun them down, then had nothing left to do but zero out all remaining sectors on the parity drive. It did start that, but has only gotten to 73% and something has gone very wrong, yet there are absolutely NO errors reported. There is possibly a problem within the drive at the high end, so perhaps the SMART report will show what is wrong. I have never seen a parity build fail at this point before, and I don't have any ideas other than a possible problem at the high end of the drive. Depending on what the SMART report reveals, you should probably try Preclearing the parity drive.

Concerning DHCP, you have changed the server to a static IP, which I too recommend, but you may still want to check your DHCP server, wherever it is (probably on the router), as it looks misconfigured. It currently appears to require renewal every 30 minutes, which is a bit ridiculous. Although this is up to your personal preference, I would recommend setting it to at least one or two days, preferably one or two weeks. It no longer matters to your unRAID server, but it still affects any other networked machines that may be using it. It is only a minor issue, though.

One other possible issue, and I don't know if you want to mess with it: Disk 6 (sde) is actually a 1TB drive, but has been configured as a 750GB drive. You can easily confirm this by checking the model numbers on your Web Management screen, and the syslog confirms there is an HPA covering the last 250GB. This probably indicates a drive that was sold as a 750GB drive at a time when the manufacturer may have had a surplus of 1TB drives and simply configured it to look like a 750GB. If you use one of the tools to raise its capacity back to 1TB, you will have to test it thoroughly, to make sure there are no flaws out there that are the reason for 'shorting' it. You would also have to rebuild parity again if you change the drive's size (but you may have to rebuild parity again anyway).
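The percentage milestones in RobJ's post can be checked with simple arithmetic. The sketch below assumes a 1.5TB (1500GB) parity drive, which is inferred from the statement that the 750GB drives end at the 50% point rather than stated anywhere in the thread; sync_percent is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: position in the parity sync as a rounded percent.
# Usage: sync_percent POSITION_GB PARITY_GB
sync_percent() {
    awk -v pos="$1" -v total="$2" \
        'BEGIN { printf "%.0f\n", pos * 100 / total }'
}

sync_percent 750 1500    # prints 50 (end of the 750GB drives)
sync_percent 1000 1500   # prints 67 (end of the 1TB drives)
```

Under that assumption, the stall at 73% would fall at roughly the 1095GB mark, past the end of every data drive, which fits the observation that only the parity drive was still being written at that point.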