June 13, 2013
After a good couple of years running unRAID I've finally come across my first proper problem. My server is on 24/7, but it was recently switched off for a week while we went on holiday. On returning at the weekend, I rebooted the server, but it was only a couple of days ago that I noticed nothing was playable. I could browse to yadis and select a file, but it would freeze and judder within 1 or 2 seconds of starting.

I then went into the web client and discovered the parity drive was showing 300 errors and was red-balled. I rebooted the server, and when it powered up the web client showed the following screens:

I then shut down, took out the parity drive and ran SeaTools on my main PC. A SMART short test came up clear. I put the parity drive back in the server, double-checked all cable connections, booted up, and it then appeared as a new parity drive. From telnet I ran another SMART test and it came back fine. Here are some other stats from the drive:

smartctl 5.39.1 2010-01-28 r3054 [i486-slackware-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:     ST32000542AS
Serial Number:    5XW12G2N
Firmware Version: CC35
User Capacity:    2,000,398,934,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Wed Jun 12 00:34:17 2013 BST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 ( 653) seconds.
Offline data collection
capabilities:                    (0x73) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 255) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x103f) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   117   099   006    Pre-fail  Always       -       152696516
  3 Spin_Up_Time            0x0003   100   100   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   099   099   020    Old_age   Always       -       1852
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   080   060   030    Pre-fail  Always       -       105442662
  9 Power_On_Hours          0x0032   077   077   000    Old_age   Always       -       20603
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       84
183 Runtime_Bad_Block       0x0032   089   089   000    Old_age   Always       -       11
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   099   000    Old_age   Always       -       4295032833
189 High_Fly_Writes         0x003a   081   081   000    Old_age   Always       -       19
190 Airflow_Temperature_Cel 0x0022   068   049   045    Old_age   Always       -       32 (Lifetime Min/Max 29/32)
194 Temperature_Celsius     0x0022   032   051   000    Old_age   Always       -       32 (0 14 0 0)
195 Hardware_ECC_Recovered  0x001a   042   034   000    Old_age   Always       -       152696516
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       1
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       157603824931142
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       2237232854
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       2454616228

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

At this point I just thought the whole incident was a freak caused by a loose cable. Overnight the parity drive has rebuilt itself and everything is showing fine.

However, I've just tried transferring a file to the server and I'm getting 1mbps, which is pathetic, so something is clearly wrong. The question is: what's wrong? My parity drive says it's fine, unRAID says it's fine, and the slow transfer speed occurs regardless of which data drive I try to write to, which still suggests that there's something up with the parity drive. I do have a spare drive if that helps, but what do I do now?

I can see a couple of red errors in the current syslog:

Jun 12 00:14:19 Tower kernel: [790]: scst: __scst_register_target_template:253:***ERROR***: Target driver mvst_scst already registered
Jun 12 00:14:19 Tower kernel: [790]: scst: __scst_register_target_template:293:***ERROR***: Failed to register target template mvst_scst

But it appears the errors from when the drive crashed are too far back in the syslog to appear?
Any ideas what's going on? All advice would be appreciated (running 4.7). Thanks
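For anyone reading a SMART attribute table like the one above, here is a minimal sketch of the usual check: smartctl treats an attribute as failing when its normalized VALUE is at or below THRESH. The rows are a subset copied from the post; the function names (`parse_rows`, `failing_now`, `failed_in_past`) are made up for illustration and are not part of smartmontools.

```python
# A few rows copied from the smartctl attribute table in the post.
SMART_ROWS = """\
  1 Raw_Read_Error_Rate     0x000f   117   099   006    Pre-fail  Always       -       152696516
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   080   060   030    Pre-fail  Always       -       105442662
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
190 Airflow_Temperature_Cel 0x0022   068   049   045    Old_age   Always       -       32 (Lifetime Min/Max 29/32)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       1
"""


def parse_rows(text):
    """Turn whitespace-separated attribute rows into dictionaries."""
    rows = []
    for line in text.splitlines():
        # Ten columns; the last one (RAW_VALUE) may itself contain spaces.
        parts = line.split(None, 9)
        if len(parts) < 10 or not parts[0].isdigit():
            continue  # skip headers and anything that is not an attribute row
        rows.append({
            "id": int(parts[0]),
            "name": parts[1],
            "value": int(parts[3]),
            "worst": int(parts[4]),
            "thresh": int(parts[5]),
            "raw": parts[9],
        })
    return rows


def failing_now(rows):
    """Attributes whose current normalized value is at or below threshold."""
    return [r["name"] for r in rows if r["value"] <= r["thresh"]]


def failed_in_past(rows):
    """Attributes whose worst recorded value is at or below threshold."""
    return [r["name"] for r in rows if r["worst"] <= r["thresh"]]


if __name__ == "__main__":
    rows = parse_rows(SMART_ROWS)
    print("failing now:   ", failing_now(rows))
    print("failed in past:", failed_in_past(rows))
```

Both lists come back empty for these rows, which agrees with the PASSED health result and the empty WHEN_FAILED column. The large raw values on some attributes are vendor-specific counters and are not judged by this check.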
June 13, 2013 (Author)
Update: the parity drive is showing no errors after rebuilding yesterday, but something is still wrong. Transfer speeds are pitiful, and every file freezes within 2 seconds when I try to play it (Blu-ray files). Just tried a DVD rip and that worked fine.
June 13, 2013
I am fairly sure that the experts are going to want a syslog to be able to provide you with some answers. Since you are running unMENU, it is a simple matter to download one to your computer. (Click on the 'Syslog' tab and look for the link.)
June 14, 2013 (Author)
Here's the syslog from today. I ran another parity check yesterday, which came up clean. All looks good, but nothing other than low-bandwidth files is working, and transfer speeds are still shockingly low. I'm stumped. The transfer speed is also the same regardless of which drive on my server I'm writing to.

syslog-2013-06-14.txt
June 14, 2013
What do you see when you click on unMENU's System-Info->Ethernet-Info button?

Joe L.
June 14, 2013
> I can see a couple of red errors in the current syslog:
> Jun 12 00:14:19 Tower kernel: [790]: scst: __scst_register_target_template:253:***ERROR***: Target driver mvst_scst already registered
> Jun 12 00:14:19 Tower kernel: [790]: scst: __scst_register_target_template:293:***ERROR***: Failed to register target template mvst_scst

That just indicates you have two of the same disk controller cards, and the driver was registered by the first one. When the second one initialized, it found its driver already registered in the kernel. It is not really an error in this case. (It just happens to have the word "ERROR" on the line in the log.)
June 14, 2013 (Author)
> What do you see when you click on unMENU's System-Info->Ethernet-Info button?
> Joe L.

Hi Joe

This is what it says. Speed is 10Mb/s, or am I reading that incorrectly?

NIC info (from ethtool)
Settings for eth0:
    Supported ports: [ TP MII ]
    Supported link modes:   10baseT/Half 10baseT/Full
                            100baseT/Half 100baseT/Full
                            1000baseT/Half 1000baseT/Full
    Supports auto-negotiation: Yes
    Advertised link modes:  10baseT/Half 10baseT/Full
                            100baseT/Half 100baseT/Full
                            1000baseT/Half 1000baseT/Full
    Advertised auto-negotiation: Yes
    Speed: 10Mb/s
    Duplex: Full
    Port: MII
    PHYAD: 0
    Transceiver: internal
    Auto-negotiation: on
    Supports Wake-on: pumbg
    Wake-on: g
    Current message level: 0x00000033 (51)
    Link detected: yes

NIC driver info (from ethtool -i)
driver: r8169
version: 2.3LK-NAPI
firmware-version:
bus-info: 0000:04:00.0

Ethernet config info (from ifconfig)
eth0      Link encap:Ethernet  HWaddr 20:cf:30:8d:9c:2e
          inet addr:192.168.0.20  Bcast:192.168.0.255  Mask:255.255.255.0
          UP BROADCAST NOTRAILERS RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:113219 errors:0 dropped:0 overruns:0 frame:0
          TX packets:59083 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:8994103 (8.5 MiB)  TX bytes:60292741 (57.4 MiB)
          Interrupt:28 Base address:0xe000
June 14, 2013
Just for information... The parity drive is not involved in any way when you are playing media files. It would not affect your ability to play a Blu-ray file vs. a DVD file. Something else, probably in the networking, is involved.

Well... that's your problem: 10Mb/s.

Your network cable/router/LAN port negotiated a 10Mb/s link. That would slow you down. It could be a bad or poorly initialized port on the router/switch you are connected to, a defective cable to the switch, a cable not fully plugged in, or a bad NIC port on your server.

Joe L.
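To make the diagnosis above concrete, here is a rough sketch that pulls the negotiated speed out of ethtool-style text and compares it with the fastest advertised mode. The sample text is trimmed from the output posted earlier in the thread; the function names and the two stream bitrates in the usage lines are illustrative assumptions, not measurements from this server.

```python
import re

# Trimmed from the ethtool output posted in this thread.
ETHTOOL_OUTPUT = """\
Settings for eth0:
    Advertised link modes:  10baseT/Half 10baseT/Full
                            100baseT/Half 100baseT/Full
                            1000baseT/Half 1000baseT/Full
    Advertised auto-negotiation: Yes
    Speed: 10Mb/s
    Duplex: Full
    Link detected: yes
"""


def negotiated_mbps(text):
    """Return the negotiated link speed in Mb/s, or None if not reported."""
    match = re.search(r"Speed:\s*(\d+)Mb/s", text)
    return int(match.group(1)) if match else None


def best_advertised_mbps(text):
    """Return the fastest link mode the NIC advertises, in Mb/s."""
    section = re.search(
        r"Advertised link modes:(.*?)Advertised auto-negotiation", text, re.S
    )
    if not section:
        return None
    speeds = [int(s) for s in re.findall(r"(\d+)baseT/", section.group(1))]
    return max(speeds) if speeds else None


def link_report(text):
    """Summarise whether the link came up slower than the NIC can manage."""
    actual = negotiated_mbps(text)
    best = best_advertised_mbps(text)
    return {
        "negotiated_mbps": actual,
        "best_advertised_mbps": best,
        "degraded": actual is not None and best is not None and actual < best,
        # 8 bits per byte; real throughput is lower still due to overhead.
        "ceiling_MB_per_s": actual / 8 if actual is not None else None,
    }


def can_stream(link_mbps, stream_mbps, headroom=0.8):
    """True if a stream fits in the link, leaving some protocol headroom."""
    return stream_mbps <= link_mbps * headroom


if __name__ == "__main__":
    print(link_report(ETHTOOL_OUTPUT))
    # Illustrative bitrates only (assumed, not taken from the poster's files).
    print("DVD-like 5 Mb/s stream fits:", can_stream(10, 5.0))
    print("Blu-ray-like 30 Mb/s stream fits:", can_stream(10, 30.0))
```

A 10Mb/s link tops out at 1.25 MB/s before overhead, which is in line with the very slow transfers reported, and it would explain why a low-bitrate DVD rip plays while high-bitrate Blu-ray files stall.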
June 14, 2013 (Author)
Thanks

Well, this is interesting. I've just run the network cable through a tester and it's fine. I've swapped ports on the network switch with my main desktop: in any of the ports the desktop retains a 1Gb connection, but the server remains on 10Mb/s.

Does this mean the NIC has gone wrong in the server? Is there any way to verify this? I've never had this problem with any hardware before.

Thanks
Matt
June 14, 2013
> Thanks >>>>>> Well this is interesting. I've just run the network cable through a tester and it's fine. <<<<<<<<

The 'best' test is to replace the cable. If that doesn't fix the problem and you are sure the switch is good, the NIC could be bad. (I believe it is a bit unusual that you dropped clear down to 10Mb/s. Most of the time, it falls back to 100Mb/s.)
June 14, 2013 (Author)
Just tried another cable and it's the same. So far I've done the following:
* Tried several ports on the switch
* Swapped cables
* Rebooted router
* Rebooted switches

Is there anything worth looking at in the BIOS of the server?
June 14, 2013
> Is there anything worth looking at in the bios of the server?

I doubt it. Your next best move is to figure out what type of slots you have free, PCI or PCIe, and get a network card to fit. PCIe is preferable.
June 14, 2013
> Just tried another cable and it's the same. So far I've done the following:
> * Tried several ports on the switch
> * Swapped cables
> * Rebooted router
> * Rebooted switches
> Is there anything worth looking at in the bios of the server?

You did reboot the server with each change?
June 14, 2013
Just to verify: I know you said you replaced the cable... but did you attach the ends yourself, or is it a factory-made cable? I know you tested it, but only for continuity? The usual inexpensive type of tester does not check whether you've properly wired the pairs of conductors; it only checks for continuity.

And lastly, forget all these questions about the cable wiring if the same cable worked before you went away on holiday.

I'd say an inexpensive LAN card is in your future.

Joe L.
June 18, 2013 (Author)
Thanks for your assistance on this. I ordered a gigabit network card from Amazon for £5, and after plugging it in it worked right off the bat (no need to disable the onboard LAN on my motherboard). 1000Mb/s and everything is streaming fine.

Not sure what the issue was originally with the missing disc. All I can think of is that a cable wasn't seated properly, but all is well now.