ives Posted November 8, 2014

Hi, I'm having a lot of trouble building a NAS. I bought all new hardware for the build:

Intel Core i3 4130 3.40GHz Socket 1150 3MB Cache
Asus B85M-G Socket 1150 VGA DVI HDMI 6+2 Channel HD Audio mATX Motherboard
8GB Corsair XMS3 DDR3 1600MHz C11
3x WD 2TB Green Desktop Hard Drive SATA 3.5
Be Quiet Straight Power 400W Fully Wired 80+ Gold Power Supply

Initially things seemed OK, but then I experienced extremely slow uploads. I ran memtest and found that my RAM had over 8000 errors, so I RMA-ed it and got some Kingston memory. Memtest said the Kingston was error free.

I decided to start over, reformat and do a parity sync. However, this is taking a VERY long time (2 MB/s), so I suspect I might also have a problem with one or more of my WD Green 2TB drives. I've run hdparm and these are the results from the 3 drives:

/dev/sdd:
 Timing cached reads: 16940 MB in 2.00 seconds = 8487.40 MB/sec
 Timing buffered disk reads: 60 MB in 3.01 seconds = 19.90 MB/sec
root@Tower:~# hdparm -tT /dev/sdc
/dev/sdc:
 Timing cached reads: 16738 MB in 2.00 seconds = 8386.48 MB/sec
 Timing buffered disk reads: 350 MB in 3.00 seconds = 116.50 MB/sec
root@Tower:~# hdparm -tT /dev/sdb
/dev/sdb:
 Timing cached reads: 13020 MB in 2.00 seconds = 6520.22 MB/sec
 Timing buffered disk reads: 10 MB in 3.17 seconds = 3.15 MB/sec
root@Tower:~#

Obviously the buffered disk read timings vary quite a bit. sdb in particular looks very slow compared to the others at 3.15 MB/sec. Should I return this disk and get a new one? Also, sdd looks very slow (19.90 MB/sec) compared to sdc (116.50 MB/sec).

Thanks for any help.
Lacehim Posted November 8, 2014

Did you run a pre-clear on those disks before you started using them?
ives Posted November 8, 2014

No, I didn't preclear. Should I have?
WeeboTech Posted November 8, 2014

ives wrote: "No, I didn't preclear. Should I have?"

Yes. Then check the SMART reports.

For now, check the SMART reports with

smartctl -a /dev/sd?

where ? is the drive letter of each drive in the array. Post them and we can take a look. Post a syslog too; there could be something that jumps out and points to the issue.
ives Posted November 8, 2014

Thanks for that. Here's sdb:

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   166   165   021    Pre-fail  Always       -       4675
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       59
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   198   198   000    Old_age   Always       -       382
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       151
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       57
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       33
193 Load_Cycle_Count        0x0032   199   199   000    Old_age   Always       -       3582
194 Temperature_Celsius     0x0022   128   120   000    Old_age   Always       -       19
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

sdc:

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   168   167   021    Pre-fail  Always       -       4591
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       59
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       151
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       57
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       31
193 Load_Cycle_Count        0x0032   199   199   000    Old_age   Always       -       3887
194 Temperature_Celsius     0x0022   128   120   000    Old_age   Always       -       19
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

sdd:

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   167   165   021    Pre-fail  Always       -       4641
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       60
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       151
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       57
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       30
193 Load_Cycle_Count        0x0032   199   199   000    Old_age   Always       -       3744
194 Temperature_Celsius     0x0022   128   119   000    Old_age   Always       -       19
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged
WeeboTech Posted November 8, 2014

Nothing jumps out on the SMART reports.

BTW, use CODE instead of QUOTE when attaching screen captures or log snippets. It uses a fixed-width font, which lines up like a screen. Makes it quicker/easier (at least for this old dog) to review.
ives Posted November 8, 2014

syslog: see attached file

syslog.txt
WeeboTech Posted November 8, 2014

It looks like you have a parity check that is still going on and has not finished; that will slow everything down. Perhaps I missed something.

Nov 7 10:02:17 Tower kernel: mdcmd (13): start NEW_ARRAY
Nov 7 10:02:18 Tower kernel: mdcmd (14): check CORRECT
..
Nov 7 18:38:58 Tower login[2023]: ROOT LOGIN on '/dev/pts/0' from 'acer'

At the end of mine I have

# grep sync /var/log/syslog
Nov 7 03:50:54 unRAID1 kernel: md: sync done. time=31713sec
Nov 7 03:50:55 unRAID1 kernel: md: recovery thread sync completion status: 0

So until you see the "md: sync done." line, the machine is going to be pretty busy.
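The grep check described above can be wrapped in a small helper. This is only a sketch: `sync_finished` is a hypothetical name, and it assumes the log lives at /var/log/syslog and that the md driver words the completion message as in the excerpt quoted above.

```shell
# sync_finished [LOGFILE]
# Succeeds (exit status 0) if the log contains the "md: sync done" line.
# Assumptions: the default path and the message wording are taken from
# the excerpt in this thread; other unRAID versions may differ.
sync_finished() {
    logfile="${1:-/var/log/syslog}"
    grep -q 'md: sync done' "$logfile"
}

# Example use:
#   if sync_finished; then
#       echo "parity sync finished"
#   else
#       echo "parity sync still running (or never started)"
#   fi
```

The helper only reports whether the completion line has been logged; it says nothing about how fast the sync is progressing.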
ives Posted November 8, 2014

Hi and thanks. My real concern is that the parity check is so slow. It's done 20% after running for 15 hours and is at 200 KB/s. Surely that indicates that something is wrong?
itimpi Posted November 8, 2014

ives wrote: "Hi and thanks. My real concern is that the parity check is so slow. It's done 20% after running for 15 hours and is at 200 KB/s. Surely that indicates that something is wrong?"

I agree that sounds slow. With 2TB drives I would only expect it to take something on the order of 4 hours. Does the syslog you provided cover the period where the parity sync is running? If there are any problems with any of the drives, I would expect there to be error reports in the syslog.
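The "order of 4 hours" figure can be sanity-checked with simple arithmetic. The sketch below is an illustration, not anything unRAID provides: `estimate_hours` is a hypothetical helper, the drive size of 2000398934016 bytes is the fdisk figure reported for these disks later in the thread, and the 120 MB/s sustained rate is an assumption close to the hdparm result for the healthy drive.

```shell
# estimate_hours BYTES MB_PER_SEC
# Prints the hours needed to read BYTES at a constant MB_PER_SEC
# (1 MB = 1000000 bytes). A constant rate is an assumption; real drives
# slow down toward the inner tracks, so treat the result as a lower bound.
estimate_hours() {
    awk -v bytes="$1" -v rate="$2" \
        'BEGIN { printf "%.1f\n", bytes / (rate * 1000000) / 3600 }'
}

estimate_hours 2000398934016 120   # about 4.6 hours at a healthy rate
estimate_hours 2000398934016 2     # about 277.8 hours at the reported 2 MB/s
```

Since a parity sync reads all drives in step, the slowest drive sets the pace, which is why one underperforming disk can stretch a job of a few hours into days.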
ives Posted November 8, 2014

Hi, and thanks for replying. Yes, the syslog is everything from when I booted the machine to install unRAID up until about 2am this morning. At that time it had been doing a parity sync for about 9 hours.

Looking at the latest syslog in unmenu after 15 hours, I can see errors. The 3 error lines are:

Nov 7 10:02:17 Tower logger: missing codepage or helper program, or other error (Errors)
Nov 7 10:02:17 Tower logger: missing codepage or helper program, or other error (Errors)
Nov 7 10:02:17 Tower emhttp: disk2 mount error: 32 (Errors)

I've uploaded the latest syslog. This is the section with errors:

Nov 7 10:02:17 Tower logger: mount: wrong fs type, bad option, bad superblock on /dev/md1,
Nov 7 10:02:17 Tower logger: missing codepage or helper program, or other error (Errors)
Nov 7 10:02:17 Tower logger: In some cases useful info is found in syslog - try
Nov 7 10:02:17 Tower logger: dmesg | tail or so
Nov 7 10:02:17 Tower logger:
Nov 7 10:02:17 Tower emhttp: _shcmd: shcmd (36): exit status: 32 (Other emhttp)
Nov 7 10:02:17 Tower emhttp: disk1 mount error: 32 (Errors)
Nov 7 10:02:17 Tower emhttp: shcmd (37): rmdir /mnt/disk1 (Other emhttp)
Nov 7 10:02:17 Tower emhttp: shcmd (38): mkdir /mnt/disk2 (Routine)
Nov 7 10:02:17 Tower emhttp: shcmd (39): set -o pipefail ; mount -t reiserfs -o user_xattr,acl,noatime,nodiratime /dev/md2 /mnt/disk2 |$stuff$ logger (Other emhttp)
Nov 7 10:02:17 Tower logger: mount: wrong fs type, bad option, bad superblock on /dev/md2,
Nov 7 10:02:17 Tower logger: missing codepage or helper program, or other error (Errors)
Nov 7 10:02:17 Tower logger: In some cases useful info is found in syslog - try
Nov 7 10:02:17 Tower logger: dmesg | tail or so
Nov 7 10:02:17 Tower logger:
Nov 7 10:02:17 Tower emhttp: _shcmd: shcmd (39): exit status: 32 (Other emhttp)
Nov 7 10:02:17 Tower emhttp: disk2 mount error: 32 (Errors)
Nov 7 10:02:17 Tower emhttp: shcmd (40): rmdir /mnt/disk2 (Other emhttp)
Nov 7 10:02:17 Tower kernel: REISERFS warning (device md1): sh-2021 reiserfs_fill_super: can not find reiserfs on md1 (Minor Issues)
Nov 7 10:02:17 Tower kernel: REISERFS warning (device md2): sh-2021 reiserfs_fill_super: can not find reiserfs on md2 (Minor Issues)
Nov 7 10:02:18 Tower emhttp: shcmd (41): /usr/local/sbin/emhttp_event disks_mounted (Other emhttp)
Nov 7 10:02:18 Tower emhttp_event: disks_mounted (Other emhttp)
Nov 7 10:02:18 Tower kernel: mdcmd (14): check CORRECT (unRAID engine)
Nov 7 10:02:18 Tower kernel: md: recovery thread woken up ... (unRAID engine)
Nov 7 10:02:18 Tower kernel: md: recovery thread syncing parity disk ... (unRAID engine)
Nov 7 10:02:18 Tower kernel: md: using 1536k window, over a total of 1953514552 blocks. (unRAID engine)
Nov 7 10:02:19 Tower emhttp: shcmd (42): :>/etc/samba/smb-shares.conf (Other emhttp)
Nov 7 10:02:19 Tower avahi-daemon[1247]: Files changed, reloading.
Nov 7 10:02:19 Tower emhttp: Restart SMB... (Other emhttp)
Nov 7 10:02:19 Tower emhttp: shcmd (43): killall -HUP smbd (Minor Issues)
Nov 7 10:02:19 Tower emhttp: shcmd (44): cp /etc/avahi/services/smb.service- /etc/avahi/services/smb.service (Other emhttp)
Nov 7 10:02:19 Tower avahi-daemon[1247]: Files changed, reloading.
Nov 7 10:02:19 Tower avahi-daemon[1247]: Service group file /services/smb.service changed, reloading.
Nov 7 10:02:19 Tower emhttp: shcmd (45): ps axc | grep -q rpc.mountd (Other emhttp)
Nov 7 10:02:19 Tower emhttp: _shcmd: shcmd (45): exit status: 1 (Other emhttp)
Nov 7 10:02:19 Tower emhttp: shcmd (46): /usr/local/sbin/emhttp_event svcs_restarted (Other emhttp)
Nov 7 10:02:19 Tower emhttp_event: svcs_restarted (Other emhttp)

syslog-2014-11-08_1.txt
WeeboTech Posted November 8, 2014

If it were me, I would start by looking at the cables. Make sure there is no interference and there are no poor paths. Maybe even replace them with other spares. I might even try re-arranging the cables.

It does sound odd to me that it's taking so long. It should not be that slow. However, nothing jumped out at me in the syslog.

If you decide to stop the parity check and re-arrange, make sure you configure the array not to start. Then you can check your raw hdparm speeds without interference.

You can also use dd to check your speeds like this.

To see the disks:

fdisk -l | grep 'Disk /'

In the commands below, ? = drive letter.

400MB read test -> dd of=/dev/null bs=4096 count=102400 if=/dev/sd?
4.2GB read test -> dd of=/dev/null bs=4096 count=1024000 if=/dev/sd?

Here's an example of mine:

root@unRAIDx:/mnt/disk3/filedb# fdisk -l | grep 'Disk /'
Disk /dev/sda: 1073 MB, 1073741824 bytes
Disk /dev/sdb: 4000.8 GB, 4000787030016 bytes
Disk /dev/sdc: 3000.6 GB, 3000592982016 bytes
Disk /dev/sdd: 4000.8 GB, 4000787030016 bytes
Disk /dev/sde: 4000.8 GB, 4000787030016 bytes
Disk /dev/sdg: 15.9 GB, 15937306624 bytes

root@unRAIDx:/mnt/disk3/filedb# dd of=/dev/null bs=4096 count=102400 if=/dev/sde
102400+0 records in
102400+0 records out
419430400 bytes (419 MB) copied, 3.38214 s, 124 MB/s

root@unRAIDx:/mnt/disk3/filedb# dd of=/dev/null bs=4096 count=1024000 if=/dev/sde
1024000+0 records in
1024000+0 records out
4194304000 bytes (4.2 GB) copied, 28.7707 s, 146 MB/s
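To run the same dd read test against several drives without retyping it, a helper along these lines may be convenient. It is a sketch only: `read_test` is a hypothetical name, and the device names in the example are the ones from this thread, so substitute whatever `fdisk -l` reports on your system.

```shell
# read_test DEVICE [COUNT]
# Reads COUNT blocks of 4096 bytes from DEVICE, discards the data, and
# prints only dd's summary line (bytes copied, seconds, throughput).
# The default COUNT of 102400 gives the roughly 400MB test shown above.
read_test() {
    dev="$1"
    count="${2:-102400}"
    # dd writes its statistics to stderr, so merge it into stdout first.
    LC_ALL=C dd if="$dev" of=/dev/null bs=4096 count="$count" 2>&1 | tail -n 1
}

# Example (as root, with the array stopped):
#   for d in /dev/sdb /dev/sdc /dev/sdd; do
#       echo "$d"
#       read_test "$d"
#   done
```

Note that repeating the test over the same blocks straight away may be served partly from the page cache and report an unrealistically high rate, so compare first runs or use the larger count.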
ives Posted November 8, 2014

Hi, thanks for that. I've decided to stop the parity check and preclear all disks instead. 2 are performing OK (120 MB/s) but one is very, very slow (about 7 MB/s).
WeeboTech Posted November 8, 2014

ives wrote: "Hi, thanks for that. I've decided to stop the parity check and preclear all disks instead. 2 are performing OK (120 MB/s) but one is very, very slow (about 7 MB/s)."

Given the same prior circumstances with hardware, I would not expect the speed to change. That's why I gave you the quick read test. You cannot expect unRAID to do anything faster than a Linux/Unix-based dd read.
ives Posted November 8, 2014

WeeboTech wrote: "Given the same prior circumstances with hardware, I would not expect the speed to change. That's why I gave you the quick read test. You cannot expect unRAID to do anything faster than a Linux/Unix-based dd read."

Did as you suggested, and my sdb drive is definitely underperforming: 3.9 MB/s compared to 45 and 49 MB/s for the other 2.

Disk /dev/sda: 7851 MB, 7851737088 bytes
Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
Disk /dev/sdd: 2000.4 GB, 2000398934016 bytes
Disk /dev/sdc: 2000.4 GB, 2000398934016 bytes

root@Tower:~# dd of=/dev/null bs=4096 count=102400 if=/dev/sdd
102400+0 records in
102400+0 records out
419430400 bytes (419 MB) copied, 9.24476 s, 45.4 MB/s

root@Tower:~# dd of=/dev/null bs=4096 count=102400 if=/dev/sdc
102400+0 records in
102400+0 records out
419430400 bytes (419 MB) copied, 8.50575 s, 49.3 MB/s

root@Tower:~# dd of=/dev/null bs=4096 count=102400 if=/dev/sdb
102400+0 records in
102400+0 records out
419430400 bytes (419 MB) copied, 107.331 s, 3.9 MB/s
WeeboTech Posted November 8, 2014

Move it to another slot and/or change the cable; you'll need to isolate whether it's the drive itself. The other choice is to run the manufacturer's diagnostic software or trigger a SMART long test.

It may not be the drive; it may be the controller, cable, connector, etc. The SMART long test reads every sector internally in the drive, thus avoiding the interface.
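For reference, a SMART long test is normally started and checked with smartctl's `-t long` and `-l selftest` options. The sketch below only prints the commands rather than running them, so it is safe to try anywhere; `smart_long_test_cmds` is a hypothetical helper name and /dev/sdb is simply the suspect drive from this thread.

```shell
# smart_long_test_cmds DEVICE
# Prints (does not execute) the smartctl commands for an extended
# self-test on DEVICE. Run the printed commands yourself as root.
smart_long_test_cmds() {
    dev="$1"
    # Start the extended (long) self-test; it runs inside the drive.
    echo "smartctl -t long $dev"
    # Later, show the self-test log to see the result.
    echo "smartctl -l selftest $dev"
}

smart_long_test_cmds /dev/sdb
```

When the test is started, smartctl reports an estimated completion time; on a 2TB drive expect it to take several hours, and check the self-test log only after that time has passed.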
ives Posted November 8, 2014

Thanks. When I've finished preclearing the other 2 drives, I'll power down and mess about with the cables. The problematic drive, sdb, has the identifier WDC_WD20EZRX-00D8PB0_WD-WCC4M9YLCTRE. Will this be written on the drive somewhere? Presumably this is a serial number?
WeeboTech Posted November 8, 2014

ives wrote: "Thanks. When I've finished preclearing the other 2 drives, I'll power down and mess about with the cables. The problematic drive, sdb, has the identifier WDC_WD20EZRX-00D8PB0_WD-WCC4M9YLCTRE. Will this be written on the drive somewhere? Presumably this is a serial number?"

WDC_WD20EZRX-00D8PB0_WD-WCC4M9YLCTRE.

The serial number should be on a sticker on the drive.
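To match the identifier against the sticker, it helps to split it into its parts. The sketch below assumes the identifier has the form model, underscore, serial, with the last underscore as the separator; that layout is an assumption based on this one example, and both function names are hypothetical.

```shell
# Split a drive identifier of the assumed form MODEL_SERIAL at its last
# underscore, using plain POSIX parameter expansion.
serial_from_id() {
    printf '%s\n' "${1##*_}"   # drop everything up to the last underscore
}

model_from_id() {
    printf '%s\n' "${1%_*}"    # drop the last underscore and what follows
}

serial_from_id WDC_WD20EZRX-00D8PB0_WD-WCC4M9YLCTRE   # WD-WCC4M9YLCTRE
model_from_id  WDC_WD20EZRX-00D8PB0_WD-WCC4M9YLCTRE   # WDC_WD20EZRX-00D8PB0
```

The drive itself can confirm the value: `smartctl -i /dev/sdb` prints a "Serial Number:" line that should agree with the label.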