reggierat Posted April 18, 2015

The issue first popped up in B14 after doing some maintenance on my array. Two 4TB drives were purchased; one of these was used to replace my 3TB parity drive, and my two oldest drives were then replaced with the old 3TB parity drive and the second 4TB drive. I also used this opportunity to migrate all array disks to XFS. It was only after moving to XFS that I started noticing the problem; up until then B14 had been very good. I have now moved to Beta 15 and the issue still persists.

As you can see from the attached screenshot, all array drives are spun down, yet disk 3 is still reporting its temperature, and it is also reporting 'disk activity' in the syslog when I turn on debugging for s3_sleep. All cabling has been double-checked, and I replaced the SATA cable on disk 3.

If I manually spin up all drives and then manually spin them down, the issue goes away. It is most often reported on disk 3, but occasionally on disk 1 and disk 2, so I don't really think the drive is at fault.

SMART report for the affected drive (disk 3, attached to port sde):

ID#  ATTRIBUTE NAME           FLAG    VALUE WORST THRESH TYPE      UPDATED  FAILED  RAW VALUE
  1  Raw Read Error Rate      0x000b  100   100   016    Pre-fail  Always   Never   0
  2  Throughput Performance   0x0005  136   136   054    Pre-fail  Offline  Never   93
  3  Spin Up Time             0x0007  135   135   024    Pre-fail  Always   Never   406 (Average 407)
  4  Start Stop Count         0x0012  099   099   000    Old age   Always   Never   7950
  5  Reallocated Sector Ct    0x0033  100   100   005    Pre-fail  Always   Never   0
  7  Seek Error Rate          0x000b  100   100   067    Pre-fail  Always   Never   0
  8  Seek Time Performance    0x0005  146   146   020    Pre-fail  Offline  Never   29
  9  Power On Hours           0x0012  097   097   000    Old age   Always   Never   24437
 10  Spin Retry Count         0x0013  100   100   060    Pre-fail  Always   Never   0
 12  Power Cycle Count        0x0032  100   100   000    Old age   Always   Never   1380
192  Power-Off Retract Count  0x0032  094   094   000    Old age   Always   Never   7951
193  Load Cycle Count         0x0012  094   094   000    Old age   Always   Never   7951
194  Temperature Celsius      0x0002  200   200   000    Old age   Always   Never   30 (Min/Max 8/46)
196  Reallocated Event Count  0x0032  100   100   000    Old age   Always   Never   0
197  Current Pending Sector   0x0022  100   100   000    Old age   Always   Never   0
198  Offline Uncorrectable    0x0008  100   100   000    Old age   Offline  Never   0
199  UDMA CRC Error Count     0x000a  200   200   000    Old age   Always   Never   378

syslog.zip
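For reference, a report like the one above can be pulled from the console with smartmontools (assuming the disk is still assigned to sde):

smartctl -a /dev/sde    # prints drive identity, the attribute table above, and the error/self-test logs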
reggierat Posted April 19, 2015

Just observing things again today. I don't understand the inner workings enough to know for sure, but it seems the disk is getting out of sync with the others in regards to its temperature and spun up/down status.

When the server is woken after a period of time, all the disks get spun up except disk 3 (because it's currently empty and the mover hasn't chosen to start writing to it yet). After the polling_attribute timer, all disk temps are read, including the disk that is spun down (disk 3). Then, after some inactivity, all disks are spun down and their temps stop being reported by the GUI. The drive that was never spun up continues to report its temperature until I finally spin it up manually and either let it spin down on its own or spin it down manually.
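Side note for anyone reproducing this: reading SMART data can itself spin a drive up. smartctl has a standby check that skips the read if the drive is asleep; a minimal sketch, assuming the suspect disk is /dev/sde:

DEV=/dev/sde                          # assumed device, adjust per disk
# -n standby makes smartctl exit without touching the disk if it is spun down
if smartctl -n standby -A "$DEV" > /tmp/smart.$$ 2>&1; then
    # drive was already spinning: print attribute 194 (Temperature Celsius)
    awk '$1 == 194 {print $10 " C"}' /tmp/smart.$$
else
    echo "$DEV is in standby; skipping temperature read"
fi
rm -f /tmp/smart.$$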
reggierat Posted April 23, 2015

The issue still persists: all disks will be spun down in the GUI, but one disk (sde) will still be reporting 'disk activity' according to the debug output for s3_sleep. Does anyone have any idea how this is possible if the disk is spun down?
jonp Posted April 24, 2015

The issue still persists: all disks will be spun down in the GUI, but one disk (sde) will still be reporting 'disk activity' according to the debug output for s3_sleep. Does anyone have any idea how this is possible if the disk is spun down?

Please disable all plugins, run your tests again, and report back with another syslog. You have a lot of plugins installed, which may have something to do with this.
reggierat Posted April 24, 2015

Here is a syslog with Docker disabled and all plugins removed except s3_sleep, which I've kept installed since it's what I'm using to check on the mystery disk activity.

For this syslog I rebooted the server, waited 5 minutes, manually spun down all disks, observed that the disks were spun down, manually put the server to sleep, manually woke it using WOL, and checked that all disks were still showing as spun down. At 7:05am I enabled s3_sleep debug logging; the disks are still spun down, but disk activity is being reported. This is in fact before any temps have been taken, so that may just have been confusing the issue previously.

To summarize: the issue occurs if a disk has not been spun up after waking from sleep. Even though the disk is shown as spun down in the GUI, it incorrectly reports disk activity for some reason. Manually spinning the drive(s) up always fixes the issue.

syslog.zip
reggierat Posted April 24, 2015

As a band-aid I'm thinking I'll just spin up all drives on wake. Could someone help me with the custom command to do this? I tested this from the console, which spins up all drives:

for disknum in 0 `ls /dev/md* | sed "sX/dev/mdXX"`; do /root/mdcmd spinup $disknum; done

but adding this to the area for commands after wake does nothing.
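For anyone puzzling over that one-liner: it builds the list of array slot numbers from the /dev/md* device names. An expanded sketch of the same thing, assuming the stock unRAID layout of /dev/md1, /dev/md2, ... and mdcmd at /root/mdcmd:

for dev in /dev/md*; do
    disknum=${dev#/dev/md}          # strip the /dev/md prefix (what the sed "sX/dev/mdXX" does)
    /root/mdcmd spinup "$disknum"   # ask the md driver to spin up that array slot
done
/root/mdcmd spinup 0                # the literal "0" in the original loop: the parity drive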
RobJ Posted April 25, 2015

Well, I've looked at both syslogs, and I honestly can't say I see any problems. In the early days of user experimentation with S3 sleep and waking there were so many errors, I have no idea how it ever worked! In yours there are problems trying to recover the drives, but it does seem to work, somehow. I'm sure you know that LimeTech does not, and cannot, support S3 sleep. I don't think I see anything to worry about. There has been a known issue with drive spin status and temps sometimes being out of sync with reality; a minor glitch.
reggierat Posted April 25, 2015

I realise it is not officially supported, but this was working perfectly up until B14. The 'disk activity' that occurs prevents the server from going back to sleep once the above conditions have been met, i.e. the server wakes up but a particular drive has not been spun up before the server is due to go back to sleep.

If I could just get a post-wake command to spin all drives up as a workaround, that would at least get it behaving as it was before the bug started occurring.
reggierat Posted April 27, 2015

In the end I modified the s3_sleep script to spin up all my drives when the server comes out of sleep. This is at least a workaround for the issue.
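Roughly, the change boils down to running that spin-up loop right after the script returns from suspend. A minimal sketch of the idea (the surrounding lines are assumptions for illustration, not the plugin's actual code):

echo mem > /sys/power/state            # standard Linux S3 suspend; execution resumes here on wake

# added on wake: force every array slot awake so unRAID's spin
# status matches reality again (slot 0 is parity)
for dev in /dev/md*; do
    /root/mdcmd spinup "${dev#/dev/md}"
done
/root/mdcmd spinup 0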
Wally Posted May 8, 2015

reggierat, how did you modify the s3_sleep script? I'm having a similar problem: my drives are spun up after coming out of S3 sleep, but unRAID thinks they are spun down, since that's how they were just before S3. Because of this, my server won't go to sleep again unless I do either a manual spin-up or spin-down to sync the status of the drives with unRAID. I tried adding post-S3 commands to spin up, with no success.
bonienl Posted May 9, 2015

As a band-aid I'm thinking I'll just spin up all drives on wake. Could someone help me with the custom command to do this? I tested this from the console, which spins up all drives:

for disknum in 0 `ls /dev/md* | sed "sX/dev/mdXX"`; do /root/mdcmd spinup $disknum; done

but adding this to the area for commands after wake does nothing.

I tested this command by putting it in the "Custom commands after wake-up" box and it is working fine for me. All my disks get spun up after waking. This is also the preferred method, rather than changing the code of the s3_sleep script itself. What exactly did you do?

I enabled debug mode to the syslog, and this is what it gives:

May 9 07:15:31 vesta s3_sleep: Wake-up now
May 9 07:15:31 vesta s3_sleep: Execute custom commands after wake-up
May 9 07:15:31 vesta kernel: Restarting tasks ... done.
May 9 07:15:31 vesta kernel: mdcmd (154): spinup 0
May 9 07:15:31 vesta kernel: mdcmd (155): spinup 1
May 9 07:15:31 vesta kernel: mdcmd (156): spinup 10
May 9 07:15:31 vesta kernel: mdcmd (157): spinup 2
May 9 07:15:31 vesta kernel: mdcmd (158): spinup 3
May 9 07:15:31 vesta kernel: mdcmd (159): spinup 4
May 9 07:15:31 vesta kernel: mdcmd (160): spinup 5
May 9 07:15:31 vesta kernel: mdcmd (161): spinup 6
May 9 07:15:31 vesta kernel: mdcmd (162): spinup 7
May 9 07:15:31 vesta kernel: mdcmd (163): spinup 8
May 9 07:15:31 vesta kernel: mdcmd (164): spinup 9
Wally Posted May 9, 2015

I just tried the spin-up command posted by reggierat, and it works if you let the s3_sleep script do the sleeping but, as expected, not if sleep is triggered manually through the webpage. My server is now sleeping normally. Thanks all.
reggierat Posted May 9, 2015

bonienl: sorry for not seeing your previous reply. I realise I shouldn't be changing things I don't properly understand, but it did provide me a workaround for this current bug. I think perhaps I was manually sleeping the server to test it, and thus not getting the code to execute. I'm trying again with the command in the post-wake commands section and letting it sleep properly.

Wally: were you experiencing the same bug, where the server would not sleep due to disk activity even though all drives were spun down?
Wally Posted May 9, 2015

reggierat, my problem was exactly the same as you described, including the disk activity in the logs and the messed-up temperature display. Before s3_sleep activates, the drives are spun down, but when the server is woken, all drives spin up as power is applied to them. The problem is that unRAID still thinks they are spun down, so it never starts its timer to spin them down, while s3_sleep checks the drives directly and sees that some are spinning; hence the constant log entries of drive activity and the resetting of its timer. In my case, drive sdb showed activity until I accessed a file on it and unRAID spun it down normally; then sdc began showing up as active in the logs.

Either spinning all the drives up or down manually will sync the drives' actual status with unRAID and allow s3_sleep to work properly, as you have noticed. The command you posted works perfectly as long as you let s3_sleep do the sleeping, since sleeping manually is direct and bypasses everything.
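For the curious, the "checks the drives directly" part is easy to reproduce with hdparm. A small sketch of that kind of check (the device range is an assumption; adjust it to your array):

for dev in /dev/sd[b-e]; do          # assumed device range, adjust to your array
    # "hdparm -C" asks the drive for its power state without waking it
    state=$(hdparm -C "$dev" 2>/dev/null | awk '/drive state/ {print $4}')
    echo "$dev: ${state:-unknown}"   # "active/idle" means spinning, "standby" means spun down
done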
bonienl Posted May 10, 2015

Thanks for explaining the issue and providing a solution (workaround). I'm not sure this can be solved in a proper way, since LT does not officially support S3 sleep, but I'll keep it in mind.
jonp Posted May 12, 2015

Just moved this to general support. This isn't a bug or a defect, but rather a consequence of using an unsupported plugin. I appreciate the desire for us to support S3 sleep on unRAID, but it is not a feature on the roadmap for 6.0 at this time. A feature request could be posted (or if one exists, add your support to it), but it's not a priority for us to incorporate this into unRAID in the near term.
reggierat Posted May 16, 2015

FWIW, this bug is still present in RC-1. I understand your position on this, jonp, but given the bug was introduced in the last two betas, I'm hoping a future update might resolve it. At least we have a workaround to keep this plugin working.