gatorguy Posted July 20, 2013

I've read a lot of threads before posting, but I have been unable to find an answer. My array has a parity drive, 3 data drives and a cache drive. Disk 2 spins constantly and has been doing so for weeks. If I manually spin down all drives, disk 2 starts back up within 30-90 seconds. My unRAID server is approx. 1.5 years old.

I've tried:
inotifywait -mr /mnt/disk2 (saw no activity)
lsof /mnt/* (saw only cache activity)
sysctl vm.block_dump=1 followed by tail -f /var/log/syslog (saw no read/write activity that I didn't cause myself by accessing a .pdf)
reads/writes on disk2 don't change in myMAIN view

The only thing I see in the syslog that I think may be related is this section, which shows me spinning down the drives and then what happens when the drive starts back up:

Jul 20 10:38:50 GatorguyMedia emhttp: Spinning down all drives...
Jul 20 10:38:50 GatorguyMedia kernel: mdcmd (39): spindown 0
Jul 20 10:38:51 GatorguyMedia kernel: mdcmd (40): spindown 1
Jul 20 10:38:52 GatorguyMedia kernel: mdcmd (41): spindown 2
Jul 20 10:38:53 GatorguyMedia kernel: mdcmd (42): spindown 3
Jul 20 10:38:54 GatorguyMedia emhttp: shcmd (49): /usr/sbin/hdparm -y /dev/hda &> /dev/null
Jul 20 10:39:16 GatorguyMedia kernel: shfs(31921): dirtied inode 4026531842 (mounts) on proc
Jul 20 10:40:36 GatorguyMedia kernel: ata3: exception Emask 0x10 SAct 0x0 SErr 0x10202 action 0xe frozen
Jul 20 10:40:36 GatorguyMedia kernel: ata3: irq_stat 0x00400000, PHY RDY changed
Jul 20 10:40:36 GatorguyMedia kernel: ata3: SError: { RecovComm Persist PHYRdyChg }
Jul 20 10:40:36 GatorguyMedia kernel: ata3: hard resetting link
Jul 20 10:40:46 GatorguyMedia kernel: ata3: softreset failed (1st FIS failed)
Jul 20 10:40:46 GatorguyMedia kernel: ata3: hard resetting link
Jul 20 10:40:47 GatorguyMedia kernel: ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jul 20 10:40:47 GatorguyMedia kernel: ata3.00: configured for UDMA/133
Jul 20 10:40:47 GatorguyMedia kernel: ata3: EH complete

I thought maybe it was a hardware issue. I checked the cable connections last night, but didn't replace a cable. Any other thoughts? I was running 5.0-rc12a until last night, when I updated to 5.0-rc16c. The "issue" happens on both versions.

Thanks in advance for your help!
JonathanM Posted July 20, 2013

Spin down the drive, then immediately unplug the network cable from the unRAID box. If the disk spins back up, the offending bit is likely to be an addon. If it stays spun down, that means one (or more) of your client devices on the network is causing it. You should be able to narrow it down from there by process of elimination.
gatorguy Posted July 20, 2013

Thanks for the reply. I'm willing to try anything. My addons are all installed on the cache drive. I only have couchpotato, sabnzbd, sickbeard and twonky.

Here comes a dumb question. How can I find out if the drive is spinning when I have the network disconnected? I can hook up a monitor to the unRAID box, but what command would I use to see if the disk is spinning?
JonathanM Posted July 20, 2013

I can usually hear a drive spin up. You may have to use the old mechanic's screwdriver trick, but you should be able to hear the motor spin up. (The screwdriver trick consists of holding one end of a screwdriver or other similar object against the chassis and the other end against your ear.)
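For anyone who would rather answer the "what command" question from a local console, here is a minimal sketch using hdparm's -C option, which reports the drive's power mode. The device name /dev/sdX is a placeholder, and both helper names are made up for illustration; neither appears in the thread. Querying the power mode normally does not wake a drive in standby, but that can depend on the drive and controller.

```shell
# Pull the power state out of `hdparm -C` output read on stdin.
# hdparm prints a line such as " drive state is:  standby".
parse_drive_state() {
    sed -n 's/^[[:space:]]*drive state is:[[:space:]]*//p'
}

# Hypothetical wrapper: print the power state of one device.
# Needs root and a real disk, e.g. drive_state /dev/sdX
drive_state() {
    hdparm -C "$1" | parse_drive_state
}
```

Run from the attached monitor and keyboard, `drive_state /dev/sdX` should print something like `standby` for a spun-down disk or `active/idle` for a spinning one.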
gatorguy Posted July 20, 2013

I did as you suggested and the drive spun up as it has been doing. I copied the syslog and you can see the network link go down, the same messages I posted above and then the network link coming up after I plugged the cable in. Do those error messages have anything to do with addons?

Syslog:

Jul 20 12:20:27 GatorguyMedia emhttp: Spinning down all drives... (Other emhttp)
Jul 20 12:20:27 GatorguyMedia kernel: mdcmd (47): spindown 0 (Routine)
Jul 20 12:20:28 GatorguyMedia kernel: mdcmd (48): spindown 1 (Routine)
Jul 20 12:20:29 GatorguyMedia kernel: mdcmd (49): spindown 2 (Routine)
Jul 20 12:20:29 GatorguyMedia kernel: mdcmd (50): spindown 3 (Routine)
Jul 20 12:20:30 GatorguyMedia emhttp: shcmd (51): /usr/sbin/hdparm -y /dev/hda &> /dev/null (Drive related)
Jul 20 12:20:32 GatorguyMedia kernel: r8168: eth0: link down (Network)
Jul 20 12:21:56 GatorguyMedia kernel: ata3: exception Emask 0x10 SAct 0x0 SErr 0x90202 action 0xe frozen (Errors)
Jul 20 12:21:56 GatorguyMedia kernel: ata3: irq_stat 0x00400000, PHY RDY changed (Drive related)
Jul 20 12:21:56 GatorguyMedia kernel: ata3: SError: { RecovComm Persist PHYRdyChg 10B8B } (Errors)
Jul 20 12:21:56 GatorguyMedia kernel: ata3: hard resetting link (Minor Issues)
Jul 20 12:22:06 GatorguyMedia kernel: ata3: softreset failed (1st FIS failed) (Minor Issues)
Jul 20 12:22:06 GatorguyMedia kernel: ata3: hard resetting link (Minor Issues)
Jul 20 12:22:08 GatorguyMedia kernel: ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300) (Drive related)
Jul 20 12:22:08 GatorguyMedia kernel: ata3.00: configured for UDMA/133 (Drive related)
Jul 20 12:22:08 GatorguyMedia kernel: ata3: EH complete (Drive related)
Jul 20 12:22:30 GatorguyMedia kernel: r8168: eth0: link up (Network)
gatorguy Posted July 20, 2013

Ok, I have tried a few different things:

Traded SATA cables with another drive. Disk 2 still would not stay spun down.
Traded SATA ports on the motherboard with another drive. Disk 2 still would not stay spun down. (Notice the error is now on ata4.)
Replaced the SATA cable and left Disk 2 on ata4, and it still would not stay spun down. Disk 2 is starting to become a common denominator...
Finally connected Disk 2 via a PCI-e SAS RAID controller, which has its own cable. It took almost 10 minutes this time, but the drive spun up. See log below.

BTW, I had inotifywait -mr /mnt/disk2 running and saw no activity when the disk spun up.

Syslog with new cable connected to ata4 on motherboard:

Jul 20 14:03:33 GatorguyMedia emhttp: Spinning down all drives... (Other emhttp)
Jul 20 14:03:33 GatorguyMedia kernel: mdcmd (19): spindown 0 (Routine)
Jul 20 14:03:34 GatorguyMedia kernel: mdcmd (20): spindown 1 (Routine)
Jul 20 14:03:35 GatorguyMedia kernel: mdcmd (21): spindown 2 (Routine)
Jul 20 14:03:36 GatorguyMedia kernel: mdcmd (22): spindown 3 (Routine)
Jul 20 14:03:37 GatorguyMedia emhttp: shcmd (44): /usr/sbin/hdparm -y /dev/hda &> /dev/null (Drive related)
Jul 20 14:05:15 GatorguyMedia kernel: ata4: exception Emask 0x10 SAct 0x0 SErr 0x10202 action 0xe frozen (Errors)
Jul 20 14:05:15 GatorguyMedia kernel: ata4: irq_stat 0x00400000, PHY RDY changed (Drive related)
Jul 20 14:05:15 GatorguyMedia kernel: ata4: SError: { RecovComm Persist PHYRdyChg } (Errors)
Jul 20 14:05:15 GatorguyMedia kernel: ata4: hard resetting link (Minor Issues)
Jul 20 14:05:25 GatorguyMedia kernel: ata4: softreset failed (1st FIS failed) (Minor Issues)
Jul 20 14:05:25 GatorguyMedia kernel: ata4: hard resetting link (Minor Issues)
Jul 20 14:05:26 GatorguyMedia kernel: ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300) (Drive related)
Jul 20 14:05:26 GatorguyMedia kernel: ata4.00: configured for UDMA/133 (Drive related)
Jul 20 14:05:26 GatorguyMedia kernel: ata4: EH complete (Drive related)

Syslog when connected to PCI-e controller:

Jul 20 15:03:58 GatorguyMedia emhttp: Spinning down all drives... (Other emhttp)
Jul 20 15:03:58 GatorguyMedia kernel: mdcmd (19): spindown 0 (Routine)
Jul 20 15:03:59 GatorguyMedia kernel: mdcmd (20): spindown 1 (Routine)
Jul 20 15:04:00 GatorguyMedia kernel: mdcmd (21): spindown 2 (Routine)
Jul 20 15:04:01 GatorguyMedia kernel: mdcmd (22): spindown 3 (Routine)
Jul 20 15:04:02 GatorguyMedia emhttp: shcmd (44): /usr/sbin/hdparm -y /dev/hda &> /dev/null (Drive related)
Jul 20 15:13:44 GatorguyMedia kernel: mvsas 0000:02:00.0: Phy2 : No sig fis (Drive related)
Jul 20 15:13:49 GatorguyMedia kernel: mvsas 0000:02:00.0: Phy2 : No sig fis (Drive related)
Jul 20 15:13:54 GatorguyMedia kernel: sas: sas_form_port: phy2 belongs to port0 already(1)! (Drive related)

So is this a drive issue of some sort?
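When comparing cable and port swaps like the ones above, it can help to tally which ATA port logs the link exceptions. The following is a rough sketch rather than an unRAID tool: the function name is made up for illustration, and it assumes syslog lines in the same "kernel: ataN: exception ..." shape as the excerpts posted in this thread.

```shell
# Count "exception" events per ATA port in syslog text read on stdin.
# Prints one "port count" pair per line, sorted by port name.
count_ata_exceptions() {
    sed -nE 's/.*kernel: (ata[0-9]+): exception.*/\1/p' \
        | sort | uniq -c | awk '{ print $2, $1 }'
}
```

A typical call would be `count_ata_exceptions < /var/log/syslog`. If the count follows the drive from ata3 to ata4 after a port swap, as it does in the logs above, that points at the drive rather than the port.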
dgaschk Posted July 21, 2013

The log fragment, out of context, is not meaningful. Post the entire log. Zip it if needed.
gatorguy Posted July 21, 2013

Thanks for trying to help. Sorry I didn't post the whole log. Here it is along with a SMART report from drive 2.

syslog-2013-07-21.zip
smart.txt
dgaschk Posted July 21, 2013

One of the add-ons must be accessing the drive. Disable half of the add-ons until the culprit is determined.
gatorguy Posted July 21, 2013

I'm asking this question out of ignorance: if one of the add-ons were accessing the drive, would I see the activity while watching a PuTTY window running the command inotifywait -mr /mnt/disk2? I've had such a window open all day and have seen no activity. When I manually spin down, the drive spins back up within 90 seconds and puts the lines listed above in the syslog.
dgaschk Posted July 22, 2013

The log posted does not contain those lines. Those lines describe a communications error with a drive. They are normal as the system starts up but they should not repeat. It could be a loose or bad SATA cable.
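To see why those lines read as a link-level communication problem, the SErr value can be decoded bit by bit. This is only a sketch: the function name is hypothetical, and it maps just the four flags that appear in the logs posted in this thread. The bit positions are my understanding of the Linux libata definitions and should be treated as an assumption to check against your kernel source. They do agree with the logs: 0x10202 prints as { RecovComm Persist PHYRdyChg } and 0x90202 adds 10B8B.

```shell
# Decode a hex SErr value into the flag names the kernel printed
# in this thread. Only four bits are mapped (assumed positions):
#   0x2     RecovComm  communication error that was recovered
#   0x200   Persist    persistent communication or data error
#   0x10000 PHYRdyChg  the PHY ready state changed
#   0x80000 10B8B      10b-to-8b decode error on the link
decode_serr() {
    val=$(( $1 ))
    out=""
    if [ $(( val & 0x2 )) -ne 0 ]; then out="$out RecovComm"; fi
    if [ $(( val & 0x200 )) -ne 0 ]; then out="$out Persist"; fi
    if [ $(( val & 0x10000 )) -ne 0 ]; then out="$out PHYRdyChg"; fi
    if [ $(( val & 0x80000 )) -ne 0 ]; then out="$out 10B8B"; fi
    echo "${out# }"
}
```

For example, `decode_serr 0x10202` covers the ata3 and ata4 exceptions logged with the original and replacement cables. All four flags describe the SATA link itself, not file system activity, which fits inotifywait and lsof showing nothing when the drive wakes.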
gatorguy Posted July 22, 2013

The log posted doesn't contain the lines I posted earlier in the thread because I powered down and rebooted the server several times to try different cable/bus positions before saving the syslog to file.

You've given me some good information. If it's a communication error with the drive, maybe there's a problem with the drive's connectors themselves. I will also try a third cable just to see if that helps. I guess the communication error can keep the drive from staying spun down?

I appreciate you trying to help out!
gatorguy Posted August 3, 2013

I bought another new cable and installed it on disk 2. I've attached my syslog from after the new cable was installed, just in case somebody sees something new. At this point, I'm assuming it's something with the drive and I'll just replace it when it dies. I've stressed too much over the damn thing. Thanks to those who gave suggestions.

syslog-2013-08-03.zip
gatorguy Posted September 8, 2013

As an update, I replaced the drive in question with a WD Red 3TB. The new drive is staying spun down like it is supposed to, oh yeah! I still don't know the cause of the old 2TB staying spun up, but I can only imagine it has something to do with the drive itself. Either way, it's beyond me.
mr-hexen Posted September 8, 2013

Use the 2TB that doesn't spin down as your cache drive!
This topic is now archived and is closed to further replies.