July 14, 2013
Hi, I've been using unRAID for a while now but have never actually replaced a drive. When I needed more storage I simply added a drive. Since I don't want to expand my array with more drives, yesterday I replaced one 1TB drive with a 2TB one to gain some space. The first thing I noticed was that after swapping the drive, the new drive wasn't blueballed (as explained in the wiki) but simply redballed (wrong drive). I started the rebuild process anyway, and after the server had been unresponsive for about 15 minutes, the array came back online and started rebuilding data onto the new drive.
What worries me now is the slow rebuild speed. It averages between 5 and 10 MB/s, which is really slow; it will take a full 3 days to rebuild the data. That's strange, because initially building my parity drive went much, much faster (a couple of hours). What rebuild speeds are normal?
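As a rough sanity check on the "3 days" estimate, the rebuild time can be worked out from the drive size and the sustained speed. This is only a sketch: it assumes the 2TB drive holds 2 × 10^12 bytes, that "MB/s" means 10^6 bytes per second, and that the speed stays constant for the whole rebuild. The function name `rebuild_days` is made up for illustration.

```python
def rebuild_days(capacity_bytes, speed_mb_per_s):
    """Days needed to write capacity_bytes at a constant speed in MB/s (10**6 bytes/s)."""
    seconds = capacity_bytes / (speed_mb_per_s * 1_000_000)
    return seconds / 86_400  # 86,400 seconds per day


TWO_TB = 2 * 10**12  # assumption: decimal terabytes

# Speeds mentioned in the thread: 5 and 10 MB/s observed, 50 MB/s expected.
for speed in (5, 10, 50):
    print(f"{speed:>2} MB/s -> {rebuild_days(TWO_TB, speed):.1f} days")
```

At these numbers, 5 to 10 MB/s works out to somewhere between two and five days, while 50 MB/s would finish in about half a day.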
July 14, 2013
That sounds slow - I would expect speeds of 50+ MB/s, although it can vary with hardware. You should post a syslog so we can see what errors (if any) are being reported.
July 16, 2013 Author
This weekend speeds even dropped to a mere 2 MB/s. Since yesterday evening the unRAID main page has been fully unresponsive (I'm unable to connect through the web interface). The array (shares) is still accessible, though. All drives are spinning, so the process is definitely still running. Maybe the CPU or memory is under full load (100%). Access through the console is possible. Does anyone know how I can monitor the process from the command line?
July 16, 2013
I use a command like
tail -n 100 -f /var/log/syslog
You can also try copying the log file to the flash if you want a copy to survive a reboot. There is also a script (keeplogs.zip) that has been posted a number of times that will redirect the log file to the flash rather than to RAM, as is normally the case. However, you want to use this sparingly, as it can fill up the USB stick with log information and also shorten its lifetime due to excessive activity.
July 16, 2013 Author
Complete syslog excerpt (from where the rebuild started until now):
Jul 13 13:23:25 Tower emhttp_event: disks_mounted
Jul 13 13:23:25 Tower kernel: mdcmd (55): check CORRECT
Jul 13 13:23:25 Tower kernel: md: recovery thread woken up ...
Jul 13 13:23:25 Tower kernel: md: recovery thread rebuilding disk10 ...
Jul 13 13:23:25 Tower kernel: md: using 1536k window, over a total of 1953514552 blocks.
Jul 13 13:23:26 Tower emhttp: shcmd (90): :>/etc/samba/smb-shares.conf
Jul 13 13:23:26 Tower emhttp: Restart SMB...
Jul 13 13:23:26 Tower emhttp: shcmd (91): killall -HUP smbd
Jul 13 13:23:26 Tower emhttp: shcmd (92): ps axc | grep -q rpc.mountd
Jul 13 13:23:26 Tower emhttp: _shcmd: shcmd (92): exit status: 1
Jul 13 13:23:26 Tower emhttp: shcmd (93): /usr/local/sbin/emhttp_event svcs_restarted
Jul 13 13:23:26 Tower emhttp_event: svcs_restarted
Jul 15 18:23:11 Tower kernel: r8168: eth0: link down
Jul 15 18:23:12 Tower kernel: r8168: eth0: link down
Jul 15 18:23:48 Tower kernel: r8168: eth0: link up
Jul 15 18:23:48 Tower kernel: r8168: eth0: link up
Jul 16 10:37:33 Tower kernel: mdcmd (56): spindown 11
Jul 16 10:37:34 Tower kernel: mdcmd (57): spindown 12
All disks are accessible from the console, but there is no unRAID management web page. Since all disks are still spinning, it isn't clear to me whether the rebuild has finished and whether I may restart the server.
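The block count in the "over a total of 1953514552 blocks" line can be turned into a size to confirm that the rebuild covers the whole 2TB drive. A small sketch, assuming the md driver counts blocks of 1 KiB (1024 bytes); that block size is an assumption, not something the log states.

```python
BLOCK_SIZE = 1024  # assumption: md reports 1 KiB blocks


def blocks_to_bytes(blocks, block_size=BLOCK_SIZE):
    """Convert a block count from the syslog into bytes."""
    return blocks * block_size


total_blocks = 1_953_514_552  # taken from the syslog line above
total_bytes = blocks_to_bytes(total_blocks)
print(f"{total_bytes:,} bytes = {total_bytes / 10**12:.2f} TB")
```

Under that assumption the total comes out at almost exactly 2 TB, so the rebuild has to write the full new drive, not just the 1TB that the old drive held.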
July 17, 2013 Author
Does anyone have an idea? Can I just reboot the server during the rebuild? Will the process continue after a reboot? Or should I just wait a couple more days?
July 17, 2013
The rebuild won't "pick back up" => it would have to start over. Can you access the Web GUI from ANOTHER PC? I've found that if one PC gets "hung" on the access, you can often get to it on a different PC ... or by rebooting your PC and then trying to access the Web GUI again.
July 18, 2013 Author
OK, access through another computer didn't work. I've managed to get unMENU working, though, so now at least I can see the status of the rebuild again. unMENU indicates a wrong disk, but the disk is rebuilding. It will take another 2 days to complete, which means it will take a whole week to rebuild a 2TB disk. The rebuild speed is between 2 MB/s and 4 MB/s now. This is not normal, since I'm using a Core 2 Duo CPU; I could write the data by hand a lot faster :]
Next time I'm better off copying the data to the new disk and rebuilding parity (which only took some hours, not days). Can I somehow find the cause of this problem? The syslog doesn't mention anything weird.
July 18, 2013 Author
These are some errors I found in the syslog on 13/07:
Jul 13 13:10:19 Tower kernel: jmicron 0000:03:00.1: IDE controller (0x197b:0x2361 rev 0x02) (System)
Jul 13 13:10:19 Tower kernel: scsi3 : ata_piix (Drive related)
Jul 13 13:10:19 Tower kernel: scsi5 : ata_piix (Drive related)
Jul 13 13:10:19 Tower kernel: ACPI Error: [FZHD] Namespace lookup failure, AE_NOT_FOUND (20120320/psargs-359) (Errors)
Jul 13 13:10:19 Tower kernel: ACPI Error: Method parse/execution failed [\_SB_.PCI0.SAT1.CHN0._GTM] (Node f4443a38), AE_NOT_FOUND (20120320/psparse-536) (Errors)
Jul 13 13:10:19 Tower kernel: ACPI Error: [FZHD] Namespace lookup failure, AE_NOT_FOUND (20120320/psargs-359) (Errors)
Jul 13 13:10:19 Tower kernel: ACPI Error: Method parse/execution failed [\_SB_.PCI0.SAT1.CHN1._GTM] (Node f4443b28), AE_NOT_FOUND (20120320/psparse-536) (Errors)
And some minor problems (indicated in unMENU):
Jul 13 13:10:21 Tower emhttp: shcmd (: killall -HUP smbd (Minor Issues)
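Picking suspicious lines out of a long syslog by eye is error-prone, so a keyword filter can help once the log has been copied to a PC. The sketch below is illustrative only: the keyword list is an assumption, not an official unRAID or unMENU rule set, and the function name `suspicious_lines` is made up. The sample lines are taken from the logs posted in this thread.

```python
import re

# Assumption: these keywords are only a starting point, not an official list.
SUSPICIOUS = re.compile(r"error|fail|link down|timeout", re.IGNORECASE)


def suspicious_lines(lines):
    """Return the syslog lines that contain any of the keywords above."""
    return [line.rstrip("\n") for line in lines if SUSPICIOUS.search(line)]


sample = [
    "Jul 13 13:23:25 Tower emhttp_event: disks_mounted",
    "Jul 13 13:10:19 Tower kernel: ACPI Error: [FZHD] Namespace lookup failure, "
    "AE_NOT_FOUND (20120320/psargs-359)",
    "Jul 15 18:23:11 Tower kernel: r8168: eth0: link down",
]

for line in suspicious_lines(sample):
    print(line)
```

To run it against a real log, pass an open file object instead of `sample`. A match only marks a line as worth reading; it does not by itself identify the cause of a slow rebuild.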
July 19, 2013
I'm also running a parity sync and it was very slow yesterday; I got about 30 MB/s. I ran a parity check before this and it reported no errors. Now I have one disk that shows 269,899,907 errors, but the parity drive that is being rebuilt reports none. At the moment I'm getting 92 MB/s, and the web GUI is unresponsive at times (slow), but the parity rebuild still seems to be in progress, with a couple of hours left.
Does this mean that the one drive is failing but can still be read, even though it takes a couple of re-reads? I cannot get a syslog because I cannot connect with telnet. If the parity sync finishes, should I run a parity check after that? And if it finishes, does that mean it got the data from the drive that has errors and the parity is valid?
July 19, 2013
"The rebuild speed is between 2 MB/s and 4 MB/s now. This is not normal, since I'm using a Core 2 Duo CPU ... I could write the data by hand a lot faster :]"
I'd like to see that. Two million bytes/second, with an average of 5 characters in a word, is about 400,000 words/second ... or 24,000,000 words/minute ==> about 300,000 times faster than a VERY good typist who can do 80 words/minute.
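The arithmetic in that comparison can be reproduced step by step. This is a minimal sketch that assumes one byte per character and uses the figures from the post (2,000,000 bytes/second, 5 characters per word, 80 words/minute); the function name `typing_ratio` is made up.

```python
def typing_ratio(bytes_per_second, chars_per_word=5, typist_wpm=80):
    """How many times faster the disk 'types' than a human typist.

    Assumes one byte per character.
    """
    words_per_second = bytes_per_second // chars_per_word
    words_per_minute = words_per_second * 60
    return words_per_minute / typist_wpm


bytes_per_second = 2_000_000
print(bytes_per_second // 5)           # words per second
print(bytes_per_second // 5 * 60)      # words per minute
print(typing_ratio(bytes_per_second))  # ratio versus an 80 wpm typist
```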
July 19, 2013
If a drive is having read errors then the parity sync is useless. Stop building parity. Attach a syslog and post a SMART report.
July 19, 2013
"If a drive is having read errors then the parity sync is useless. Stop building parity. Attach a syslog and post a SMART report."
At the moment the server is unresponsive. If I reboot, I won't be able to get a useful syslog, will I? I was replacing the 2TB parity drive with a 3TB drive, and I still have the original parity drive. One of the data drives showed the errors. Should I replace the new 3TB drive with the old 2TB one and select trust parity?
EDIT: I have now rebooted the server, and here are the syslog and the SMART report for the drive that had the errors. I also had "Write corrections to parity" checked; is the parity now unreliable?
smart.zip syslog-2013-07-19.zip
July 19, 2013
Put the original parity drive back in and check all of the disk connections and cabling. Reset the config and trust parity. Do a parity check and see if any drives are having read errors. There should be no errors of any kind and zero parity updates.
July 19, 2013
Thanks dgaschk! I'm running a parity check now with the old parity drive, and at 0.3% there are already 187 errors corrected. At about 20-30 MB/s it's still slower than usual (50-60 MB/s). Because the parity check was good before installing the new drive, I think (and hope) this was just a case of a loose cable. BTW, my drives are really full. Does that affect the read speed?
July 20, 2013
"Do any drives show read errors?"
Last checked on Sat Jul 20 08:13:11 2013 EEST (today), finding 193 errors.
> Duration: 12 hours, 29 minutes, 29 seconds. Average speed: 44.5 MB/sec
It shows no read errors on any of the discs. So it should be fine to replace the parity... again...
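The reported duration and average speed can be cross-checked against the 2TB parity drive mentioned earlier in the thread: multiplying the two should give roughly the size of that drive. A small sketch, assuming "MB" means 10^6 bytes; the function name `data_covered_tb` is made up.

```python
def data_covered_tb(hours, minutes, seconds, avg_mb_per_s):
    """Terabytes (10**12 bytes) covered at avg_mb_per_s over the given duration."""
    duration_s = hours * 3600 + minutes * 60 + seconds
    return duration_s * avg_mb_per_s * 1_000_000 / 10**12


# Figures from the parity check report: 12 h 29 min 29 s at 44.5 MB/sec.
print(f"{data_covered_tb(12, 29, 29, 44.5):.2f} TB")
```

The result is very close to 2 TB, which suggests the check ran across the whole parity drive at the stated average speed.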
July 20, 2013
"Do any drives show read errors?
Last checked on Sat Jul 20 08:13:11 2013 EEST (today), finding 193 errors.
> Duration: 12 hours, 29 minutes, 29 seconds. Average speed: 44.5 MB/sec
Shows no read errors on any of the discs. So should be fine to replace the parity... again..."
Run another parity check. There should be zero errors. A parity check with zero corrections should be completed before replacing a disk.